[jira] [Comment Edited] (SOLR-14034) remove deprecated min_rf references
[ https://issues.apache.org/jira/browse/SOLR-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223427#comment-17223427 ] Tim Dillon edited comment on SOLR-14034 at 11/4/20, 8:16 AM: - [~marcussorealheis] I see it has been a while since there was any activity on this, is it still available to work on? I'm new here as well and this seems like a good place to get started. So I was looking at this and found something I'm not sure about. There's two methods that take minRf as a parameter, sendDocs and sendDocsWithRetry (linked below), that are called in ReplicationFactorTest.java multiple times. * [https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/cloud/ReplicationFactorTest.java#L462] * [https://github.com/apache/lucene-solr/blob/master/solr/test-framework/src/java/org/apache/solr/cloud/AbstractFullDistribZkTestBase.java#L938] However, I also found calls to these methods that pass expectedRf and expectedRfDBQ instead of minRf. * [https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/cloud/ReplicationFactorTest.java#L417] * [https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/cloud/ReplicationFactorTest.java#L427] * [https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/cloud/ReplicationFactorTest.java#L230] Is it necessary to pass expectedRf to these methods? The tests still passed after refactoring the two methods and removing the minRf/expectedRf parameters, but I'm not sure if that would invalidate these tests or not. If it is indeed necessary to pass the expectedRf parameters, should I leave these methods as is? Or would it make sense to create overloaded methods to handle the few calls with the additional expectedRf parameter? Sorry for all the questions, just wanted to make sure I'm removing _all_ references to minRf without causing problems elsewhere. was (Author: trdillon): [~marcussorealheis] I see it has been a while since there was any activity on this, is it still available to work on? I'm new here as well and this seems like a good place to get started. > remove deprecated min_rf references > --- > > Key: SOLR-14034 > URL: https://issues.apache.org/jira/browse/SOLR-14034 > Project: Solr > Issue Type: Task >Reporter: Christine Poerschke >Priority: Blocker > Labels: newdev > Fix For: master (9.0) > > > * {{min_rf}} support was added under SOLR-5468 in version 4.9 > (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.9.0/solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java#L50) > and deprecated under SOLR-12767 in version 7.6 > (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.6.0/solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java#L57-L61) > * http://lucene.apache.org/solr/7_6_0/changes/Changes.html and > https://lucene.apache.org/solr/guide/8_0/major-changes-in-solr-8.html#solr-7-6 > both clearly mention the deprecation > This ticket is to fully remove {{min_rf}} references in code, tests and > documentation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225923#comment-17225923 ] Krishan commented on SOLR-14983: Yea looks good. Thanks [~cpoerschke] and [~Baik] for the fix and test cases. > Score returned in search request is original score and not reranked score > - > > Key: SOLR-14983 > URL: https://issues.apache.org/jira/browse/SOLR-14983 > Project: Solr > Issue Type: Bug >Affects Versions: 8.0 >Reporter: Krishan >Priority: Major > Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, > SOLR-14983.patch > > > Score returned in search request is original score and not reranked score > post the changes in https://issues.apache.org/jira/browse/LUCENE-8412. > Commit - > [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d] > Specifically - > if (cmd.getSort() != null && query instanceof RankQuery == false && > (cmd.getFlags() & GET_SCORES) != 0) { > TopFieldCollector.populateScores(topDocs.scoreDocs, this, query); > } > in SolrIndexSearcher.java recomputes the score but outputs only the original > score and not the reranked score. > > The issue is cmd.getQuery() is a type of RankQuery but the "query" variable > is a boolean query and probably replacing query with cmd.getQuery() should be > the right fix for this so that the score is not overriden for rerank queries > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-14983: --- Attachment: SOLR-14983.patch > Score returned in search request is original score and not reranked score > - > > Key: SOLR-14983 > URL: https://issues.apache.org/jira/browse/SOLR-14983 > Project: Solr > Issue Type: Bug >Affects Versions: 8.0 >Reporter: Krishan >Priority: Major > Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, > SOLR-14983.patch, SOLR-14983.patch > > > Score returned in search request is original score and not reranked score > post the changes in https://issues.apache.org/jira/browse/LUCENE-8412. > Commit - > [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d] > Specifically - > if (cmd.getSort() != null && query instanceof RankQuery == false && > (cmd.getFlags() & GET_SCORES) != 0) { > TopFieldCollector.populateScores(topDocs.scoreDocs, this, query); > } > in SolrIndexSearcher.java recomputes the score but outputs only the original > score and not the reranked score. > > The issue is cmd.getQuery() is a type of RankQuery but the "query" variable > is a boolean query and probably replacing query with cmd.getQuery() should be > the right fix for this so that the score is not overriden for rerank queries > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225928#comment-17225928 ] Christine Poerschke commented on SOLR-14983: Attached unit test tweak: * the SolrIndexSearcher change is in two places yet the unit test previously did not reflect that. Either {{getDocListAndSetNC}} or {{getDocListNC}} is called based on the presence or absence of the {{GET_DOCSET}} flag – e.g. [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.7.0/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1406-L1414] – and so the unit test now covers both flag possibilities. Looking ahead to patch commit time: * [~krishan1390], do you have any preference as to {{solr/CHANGES.txt}} attribution choice of name for yourself e.g. "Krishan" (current JIRA full name) or "krishan1390" (JIRA Username) or a variant of the two or something else perhaps? It's entirely your choice, different people use different styles e.g. [https://lucene.apache.org/solr/8_6_3/changes/Changes.html] for the 8.6.3 release. * [~Baik], I'm going to assume you wish to use "Jason Baik" (current JIRA full name) for your {{solr/CHANGES.txt}} attribution choice of name but if you prefer a different style, please let me know. * "precommit" checks now pass (two unused imports needed removing) and I'm yet to fully run all the solr test suites but assuming they pass and if there's no further comments or concerns etc. here then I'll aim to commit the change middle of next week hopefully. > Score returned in search request is original score and not reranked score > - > > Key: SOLR-14983 > URL: https://issues.apache.org/jira/browse/SOLR-14983 > Project: Solr > Issue Type: Bug >Affects Versions: 8.0 >Reporter: Krishan >Priority: Major > Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, > SOLR-14983.patch, SOLR-14983.patch > > > Score returned in search request is original score and not reranked score > post the changes in https://issues.apache.org/jira/browse/LUCENE-8412. > Commit - > [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d] > Specifically - > if (cmd.getSort() != null && query instanceof RankQuery == false && > (cmd.getFlags() & GET_SCORES) != 0) { > TopFieldCollector.populateScores(topDocs.scoreDocs, this, query); > } > in SolrIndexSearcher.java recomputes the score but outputs only the original > score and not the reranked score. > > The issue is cmd.getQuery() is a type of RankQuery but the "query" variable > is a boolean query and probably replacing query with cmd.getQuery() should be > the right fix for this so that the score is not overriden for rerank queries > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-14983: --- Status: Patch Available (was: Open) > Score returned in search request is original score and not reranked score > - > > Key: SOLR-14983 > URL: https://issues.apache.org/jira/browse/SOLR-14983 > Project: Solr > Issue Type: Bug >Affects Versions: 8.0 >Reporter: Krishan >Assignee: Christine Poerschke >Priority: Major > Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, > SOLR-14983.patch, SOLR-14983.patch > > > Score returned in search request is original score and not reranked score > post the changes in https://issues.apache.org/jira/browse/LUCENE-8412. > Commit - > [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d] > Specifically - > if (cmd.getSort() != null && query instanceof RankQuery == false && > (cmd.getFlags() & GET_SCORES) != 0) { > TopFieldCollector.populateScores(topDocs.scoreDocs, this, query); > } > in SolrIndexSearcher.java recomputes the score but outputs only the original > score and not the reranked score. > > The issue is cmd.getQuery() is a type of RankQuery but the "query" variable > is a boolean query and probably replacing query with cmd.getQuery() should be > the right fix for this so that the score is not overriden for rerank queries > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Assigned] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke reassigned SOLR-14983: -- Assignee: Christine Poerschke > Score returned in search request is original score and not reranked score > - > > Key: SOLR-14983 > URL: https://issues.apache.org/jira/browse/SOLR-14983 > Project: Solr > Issue Type: Bug >Affects Versions: 8.0 >Reporter: Krishan >Assignee: Christine Poerschke >Priority: Major > Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, > SOLR-14983.patch, SOLR-14983.patch > > > Score returned in search request is original score and not reranked score > post the changes in https://issues.apache.org/jira/browse/LUCENE-8412. > Commit - > [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d] > Specifically - > if (cmd.getSort() != null && query instanceof RankQuery == false && > (cmd.getFlags() & GET_SCORES) != 0) { > TopFieldCollector.populateScores(topDocs.scoreDocs, this, query); > } > in SolrIndexSearcher.java recomputes the score but outputs only the original > score and not the reranked score. > > The issue is cmd.getQuery() is a type of RankQuery but the "query" variable > is a boolean query and probably replacing query with cmd.getQuery() should be > the right fix for this so that the score is not overriden for rerank queries > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-8673) o.a.s.search.facet classes not public/extendable
[ https://issues.apache.org/jira/browse/SOLR-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225936#comment-17225936 ] Christine Poerschke commented on SOLR-8673: --- {quote}... work done in SOLR-14482 has extracted FacetContext to a top-level class and made it public at the same time. {quote} {quote}... That class is now public since 8.6.. but all its fields have package-visibility and there are not even any get methods, so it's impossible to use it. {quote} Hi [~TimOwen]! From the above my guess is "that class" is {{FacetContext}} i.e. [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.7.0/solr/core/src/java/org/apache/solr/search/facet/FacetContext.java?] Looks like so far only the {{debugInfo}} package visible member has get/set accessors, hmm. I'm unclear on which of the other members would be useful in custom code, would you be able to open a pull request or create a patch for this ticket perhaps re: that? Thanks! > o.a.s.search.facet classes not public/extendable > > > Key: SOLR-8673 > URL: https://issues.apache.org/jira/browse/SOLR-8673 > Project: Solr > Issue Type: Improvement > Components: Facet Module >Affects Versions: 5.4.1 >Reporter: Markus Jelsma >Priority: Major > Fix For: 6.2, 7.0 > > > It is not easy to create a custom JSON facet function. A simple function > based on AvgAgg quickly results in the following compilation failures: > {code} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) > on project openindex-solr: Compilation failure: Compilation failure: > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[22,36] > org.apache.solr.search.facet.FacetContext is not public in > org.apache.solr.search.facet; cannot be accessed from outside package > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[23,36] > org.apache.solr.search.facet.FacetDoubleMerger is not public in > org.apache.solr.search.facet; cannot be accessed from outside package > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[40,32] > cannot find symbol > [ERROR] symbol: class FacetContext > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[49,39] > cannot find symbol > [ERROR] symbol: class FacetDoubleMerger > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[54,43] > cannot find symbol > [ERROR] symbol: class Context > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg.Merger > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[41,16] > cannot find symbol > [ERROR] symbol: class AvgSlotAcc > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[46,12] > incompatible types: i.o.s.search.facet.CustomAvgAgg.Merger cannot be > converted to org.apache.solr.search.facet.FacetMerger > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[53,5] > method does not override or implement a method from a supertype > [ERROR] > 
/home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[60,5] > method does not override or implement a method from a supertype > {code} > It seems lots of classes are tucked away in FacetModule, which we can't reach > from outside. > Originates from this thread: > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201602.mbox/%3ccab_8yd9ldbg_0zxm_h1igkfm6bqeypd5ilyy7tty8cztscv...@mail.gmail.com%3E -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-8673) o.a.s.search.facet classes not public/extendable
[ https://issues.apache.org/jira/browse/SOLR-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-8673: -- Description: It is not easy to create a custom JSON facet function. A simple function based on AvgAgg quickly results in the following compilation failures: {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) on project openindex-solr: Compilation failure: Compilation failure: [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[22,36] org.apache.solr.search.facet.FacetContext is not public in org.apache.solr.search.facet; cannot be accessed from outside package [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[23,36] org.apache.solr.search.facet.FacetDoubleMerger is not public in org.apache.solr.search.facet; cannot be accessed from outside package [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[40,32] cannot find symbol [ERROR] symbol: class FacetContext [ERROR] location: class i.o.s.search.facet.CustomAvgAgg [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[49,39] cannot find symbol [ERROR] symbol: class FacetDoubleMerger [ERROR] location: class i.o.s.search.facet.CustomAvgAgg [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[54,43] cannot find symbol [ERROR] symbol: class Context [ERROR] location: class i.o.s.search.facet.CustomAvgAgg.Merger [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[41,16] cannot find symbol [ERROR] symbol: class AvgSlotAcc [ERROR] location: class i.o.s.search.facet.CustomAvgAgg [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[46,12] incompatible types: i.o.s.search.facet.CustomAvgAgg.Merger cannot be converted to org.apache.solr.search.facet.FacetMerger [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[53,5] method does not override or implement a method from a supertype [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[60,5] method does not override or implement a method from a supertype {code} It seems lots of classes are tucked away in FacetModule, which we can't reach from outside. Originates from this thread: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201602.mbox/%3ccab_8yd9ldbg_0zxm_h1igkfm6bqeypd5ilyy7tty8cztscv...@mail.gmail.com%3E ( also available at https://lists.apache.org/thread.html/9fddcad3136ec908ce1c57881f8d3069e5d153f08b71f80f3e18d995%401455019826%40%3Csolr-user.lucene.apache.org%3E ) was: It is not easy to create a custom JSON facet function. 
A simple function based on AvgAgg quickly results in the following compilation failures: {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) on project openindex-solr: Compilation failure: Compilation failure: [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[22,36] org.apache.solr.search.facet.FacetContext is not public in org.apache.solr.search.facet; cannot be accessed from outside package [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[23,36] org.apache.solr.search.facet.FacetDoubleMerger is not public in org.apache.solr.search.facet; cannot be accessed from outside package [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[40,32] cannot find symbol [ERROR] symbol: class FacetContext [ERROR] location: class i.o.s.search.facet.CustomAvgAgg [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[49,39] cannot find symbol [ERROR] symbol: class FacetDoubleMerger [ERROR] location: class i.o.s.search.facet.CustomAvgAgg [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[54,43] cannot find symbol [ERROR] symbol: class Context [ERROR] location: class i.o.s.search.facet.CustomAvgAgg.Merger [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[41,16] cannot find symbol [ERROR] symbol: class AvgSlotAcc [ERROR] location: class i.o.s.search.facet.CustomAvgAgg [ERROR] /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[46,12] incompatible types: i.o.s.search.facet.CustomAvgAgg.Merger cannot be converted to org.apache.solr.search.facet.FacetMerger [ERROR] /home/markus/projects/openindex/solr/trunk/src/ma
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
murblanc commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517207532 ## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/ClusterEventProducerFactory.java ## @@ -0,0 +1,173 @@ +package org.apache.solr.cluster.events.impl; + +import org.apache.solr.api.ContainerPluginsRegistry; +import org.apache.solr.cluster.events.ClusterEvent; +import org.apache.solr.cluster.events.ClusterEventListener; +import org.apache.solr.cluster.events.ClusterEventProducer; +import org.apache.solr.cluster.events.ClusterEventProducerBase; +import org.apache.solr.cluster.events.NoOpProducer; +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Set; + +/** + * This class helps in handling the initial registration of plugin-based listeners, + * when both the final {@link ClusterEventProducer} implementation and listeners + * are configured using plugins. + */ +public class ClusterEventProducerFactory extends ClusterEventProducerBase { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + private ContainerPluginsRegistry.PluginRegistryListener initialPluginListener; + private boolean created = false; + + public ClusterEventProducerFactory(CoreContainer cc) { +super(cc); +initialPluginListener = new ContainerPluginsRegistry.PluginRegistryListener() { Review comment: Wouldn't code be easier to read/maintain if named classes were defined rather than anonymous ones? They would have names that help understand their function, class javadoc etc... I feel the anonymous classes in this PR have enough content to justify a non anonymous implementation. That's likely just a personal preference of mine so feel free to ignore. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
murblanc commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517211398 ## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/ClusterEventProducerFactory.java ## @@ -0,0 +1,173 @@ +package org.apache.solr.cluster.events.impl; + +import org.apache.solr.api.ContainerPluginsRegistry; +import org.apache.solr.cluster.events.ClusterEvent; +import org.apache.solr.cluster.events.ClusterEventListener; +import org.apache.solr.cluster.events.ClusterEventProducer; +import org.apache.solr.cluster.events.ClusterEventProducerBase; +import org.apache.solr.cluster.events.NoOpProducer; +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Set; + +/** + * This class helps in handling the initial registration of plugin-based listeners, + * when both the final {@link ClusterEventProducer} implementation and listeners + * are configured using plugins. + */ +public class ClusterEventProducerFactory extends ClusterEventProducerBase { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + private ContainerPluginsRegistry.PluginRegistryListener initialPluginListener; + private boolean created = false; + + public ClusterEventProducerFactory(CoreContainer cc) { +super(cc); +initialPluginListener = new ContainerPluginsRegistry.PluginRegistryListener() { + @Override + public void added(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + registerListener((ClusterEventListener) instance); +} + } + + @Override + public void deleted(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + unregisterListener((ClusterEventListener) instance); +} + } + + @Override + public void modified(ContainerPluginsRegistry.ApiInfo old, ContainerPluginsRegistry.ApiInfo replacement) { +added(replacement); +deleted(old); + } +}; + } + + @Override + public Set getSupportedEventTypes() { +return NoOpProducer.ALL_EVENT_TYPES; + } + + /** + * This method returns an initial plugin registry listener that helps to capture the + * freshly loaded listener plugins before the final cluster event producer is created. + * @return initial listener + */ + public ContainerPluginsRegistry.PluginRegistryListener getPluginRegistryListener() { +return initialPluginListener; + } + + /** + * Create a {@link ClusterEventProducer} based on the current plugin configurations. + * NOTE: this method can only be called once because it has side-effects, such as + * transferring the initially collected listeners to the resulting producer's instance, and + * installing a {@link org.apache.solr.api.ContainerPluginsRegistry.PluginRegistryListener}. + * Calling this method more than once will result in an exception. 
+ * @param plugins current plugin configurations + * @return configured instance of cluster event producer (with side-effects, see above) + */ + public DelegatingClusterEventProducer create(ContainerPluginsRegistry plugins) { +if (created) { + throw new RuntimeException("this factory can be called only once!"); +} +final DelegatingClusterEventProducer clusterEventProducer = new DelegatingClusterEventProducer(cc); +// since this is a ClusterSingleton, register it as such + cc.getClusterSingletons().getSingletons().put(ClusterEventProducer.PLUGIN_NAME +"_delegate", clusterEventProducer); +ContainerPluginsRegistry.ApiInfo clusterEventProducerInfo = plugins.getPlugin(ClusterEventProducer.PLUGIN_NAME); +if (clusterEventProducerInfo != null) { + // the listener in ClusterSingletons already registered it + clusterEventProducer.setDelegate((ClusterEventProducer) clusterEventProducerInfo.getInstance()); +} else { + // use the default NoOp impl +} +// transfer those listeners that were already registered to the initial impl +transferListeners(clusterEventProducer, plugins); +// install plugin registry listener +ContainerPluginsRegistry.PluginRegistryListener pluginListener = new ContainerPluginsRegistry.PluginRegistryListener() { + @Override + public void added(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + ClusterEvent
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
murblanc commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517211904 ## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/ClusterEventProducerFactory.java ## @@ -0,0 +1,173 @@ +package org.apache.solr.cluster.events.impl; + +import org.apache.solr.api.ContainerPluginsRegistry; +import org.apache.solr.cluster.events.ClusterEvent; +import org.apache.solr.cluster.events.ClusterEventListener; +import org.apache.solr.cluster.events.ClusterEventProducer; +import org.apache.solr.cluster.events.ClusterEventProducerBase; +import org.apache.solr.cluster.events.NoOpProducer; +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Set; + +/** + * This class helps in handling the initial registration of plugin-based listeners, + * when both the final {@link ClusterEventProducer} implementation and listeners + * are configured using plugins. + */ +public class ClusterEventProducerFactory extends ClusterEventProducerBase { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + private ContainerPluginsRegistry.PluginRegistryListener initialPluginListener; + private boolean created = false; + + public ClusterEventProducerFactory(CoreContainer cc) { +super(cc); +initialPluginListener = new ContainerPluginsRegistry.PluginRegistryListener() { + @Override + public void added(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + registerListener((ClusterEventListener) instance); +} + } + + @Override + public void deleted(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + unregisterListener((ClusterEventListener) instance); +} + } + + @Override + public void modified(ContainerPluginsRegistry.ApiInfo old, ContainerPluginsRegistry.ApiInfo replacement) { +added(replacement); +deleted(old); + } +}; + } + + @Override + public Set getSupportedEventTypes() { +return NoOpProducer.ALL_EVENT_TYPES; + } + + /** + * This method returns an initial plugin registry listener that helps to capture the + * freshly loaded listener plugins before the final cluster event producer is created. + * @return initial listener + */ + public ContainerPluginsRegistry.PluginRegistryListener getPluginRegistryListener() { +return initialPluginListener; + } + + /** + * Create a {@link ClusterEventProducer} based on the current plugin configurations. + * NOTE: this method can only be called once because it has side-effects, such as + * transferring the initially collected listeners to the resulting producer's instance, and + * installing a {@link org.apache.solr.api.ContainerPluginsRegistry.PluginRegistryListener}. + * Calling this method more than once will result in an exception. 
+ * @param plugins current plugin configurations + * @return configured instance of cluster event producer (with side-effects, see above) + */ + public DelegatingClusterEventProducer create(ContainerPluginsRegistry plugins) { +if (created) { + throw new RuntimeException("this factory can be called only once!"); +} +final DelegatingClusterEventProducer clusterEventProducer = new DelegatingClusterEventProducer(cc); +// since this is a ClusterSingleton, register it as such + cc.getClusterSingletons().getSingletons().put(ClusterEventProducer.PLUGIN_NAME +"_delegate", clusterEventProducer); +ContainerPluginsRegistry.ApiInfo clusterEventProducerInfo = plugins.getPlugin(ClusterEventProducer.PLUGIN_NAME); +if (clusterEventProducerInfo != null) { + // the listener in ClusterSingletons already registered it + clusterEventProducer.setDelegate((ClusterEventProducer) clusterEventProducerInfo.getInstance()); +} else { + // use the default NoOp impl +} +// transfer those listeners that were already registered to the initial impl +transferListeners(clusterEventProducer, plugins); Review comment: Any potential for race conditions here? Can events be produced while this code executes? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org -
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
murblanc commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517213017 ## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/ClusterEventProducerFactory.java ## @@ -0,0 +1,173 @@ +package org.apache.solr.cluster.events.impl; + +import org.apache.solr.api.ContainerPluginsRegistry; +import org.apache.solr.cluster.events.ClusterEvent; +import org.apache.solr.cluster.events.ClusterEventListener; +import org.apache.solr.cluster.events.ClusterEventProducer; +import org.apache.solr.cluster.events.ClusterEventProducerBase; +import org.apache.solr.cluster.events.NoOpProducer; +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Set; + +/** + * This class helps in handling the initial registration of plugin-based listeners, + * when both the final {@link ClusterEventProducer} implementation and listeners + * are configured using plugins. + */ +public class ClusterEventProducerFactory extends ClusterEventProducerBase { Review comment: This class is only used by a single thread, right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-8673) o.a.s.search.facet classes not public/extendable
[ https://issues.apache.org/jira/browse/SOLR-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225955#comment-17225955 ] Tim Owen commented on SOLR-8673: Hi Christine! Yes, it is FacetContext.. I am going to do that today, in our local build of Solr I'll add get methods, and then continue with my custom plugin aggregate to check it works. Then I'll attach a patch here. Currently we have our custom aggregates added into the solr build locally, but I'm really keen to move it out into a plugin jar so we don't need to keep local patching. Most built-in aggregates and facet code uses the FacetContext fields by relying on package-visibility for things like fcontext.req and fcontext.searcher but I might as well make them all gettable. > o.a.s.search.facet classes not public/extendable > > > Key: SOLR-8673 > URL: https://issues.apache.org/jira/browse/SOLR-8673 > Project: Solr > Issue Type: Improvement > Components: Facet Module >Affects Versions: 5.4.1 >Reporter: Markus Jelsma >Priority: Major > Fix For: 6.2, 7.0 > > > It is not easy to create a custom JSON facet function. A simple function > based on AvgAgg quickly results in the following compilation failures: > {code} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) > on project openindex-solr: Compilation failure: Compilation failure: > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[22,36] > org.apache.solr.search.facet.FacetContext is not public in > org.apache.solr.search.facet; cannot be accessed from outside package > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[23,36] > org.apache.solr.search.facet.FacetDoubleMerger is not public in > org.apache.solr.search.facet; cannot be accessed from outside package > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[40,32] > cannot find symbol > [ERROR] symbol: class FacetContext > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[49,39] > cannot find symbol > [ERROR] symbol: class FacetDoubleMerger > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[54,43] > cannot find symbol > [ERROR] symbol: class Context > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg.Merger > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[41,16] > cannot find symbol > [ERROR] symbol: class AvgSlotAcc > [ERROR] location: class i.o.s.search.facet.CustomAvgAgg > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[46,12] > incompatible types: i.o.s.search.facet.CustomAvgAgg.Merger cannot be > converted to org.apache.solr.search.facet.FacetMerger > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[53,5] > method does not override or implement a method from a supertype > [ERROR] > /home/markus/projects/openindex/solr/trunk/src/main/java/i.o.s.search/facet/CustomAvgAgg.java:[60,5] > method does not override or implement a method from a supertype > {code} > It seems lots of classes are tucked away in FacetModule, which we can't reach > from outside. 
> Originates from this thread: > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201602.mbox/%3ccab_8yd9ldbg_0zxm_h1igkfm6bqeypd5ilyy7tty8cztscv...@mail.gmail.com%3E > ( also available at > https://lists.apache.org/thread.html/9fddcad3136ec908ce1c57881f8d3069e5d153f08b71f80f3e18d995%401455019826%40%3Csolr-user.lucene.apache.org%3E > ) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
murblanc commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517218843 ## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/ClusterEventProducerFactory.java ## @@ -0,0 +1,173 @@ +package org.apache.solr.cluster.events.impl; + +import org.apache.solr.api.ContainerPluginsRegistry; +import org.apache.solr.cluster.events.ClusterEvent; +import org.apache.solr.cluster.events.ClusterEventListener; +import org.apache.solr.cluster.events.ClusterEventProducer; +import org.apache.solr.cluster.events.ClusterEventProducerBase; +import org.apache.solr.cluster.events.NoOpProducer; +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Set; + +/** + * This class helps in handling the initial registration of plugin-based listeners, + * when both the final {@link ClusterEventProducer} implementation and listeners + * are configured using plugins. + */ +public class ClusterEventProducerFactory extends ClusterEventProducerBase { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + private ContainerPluginsRegistry.PluginRegistryListener initialPluginListener; + private boolean created = false; + + public ClusterEventProducerFactory(CoreContainer cc) { +super(cc); +initialPluginListener = new ContainerPluginsRegistry.PluginRegistryListener() { + @Override + public void added(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + registerListener((ClusterEventListener) instance); +} + } + + @Override + public void deleted(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + unregisterListener((ClusterEventListener) instance); +} + } + + @Override + public void modified(ContainerPluginsRegistry.ApiInfo old, ContainerPluginsRegistry.ApiInfo replacement) { +added(replacement); +deleted(old); + } +}; + } + + @Override + public Set getSupportedEventTypes() { +return NoOpProducer.ALL_EVENT_TYPES; + } + + /** + * This method returns an initial plugin registry listener that helps to capture the + * freshly loaded listener plugins before the final cluster event producer is created. + * @return initial listener + */ + public ContainerPluginsRegistry.PluginRegistryListener getPluginRegistryListener() { +return initialPluginListener; + } + + /** + * Create a {@link ClusterEventProducer} based on the current plugin configurations. + * NOTE: this method can only be called once because it has side-effects, such as + * transferring the initially collected listeners to the resulting producer's instance, and + * installing a {@link org.apache.solr.api.ContainerPluginsRegistry.PluginRegistryListener}. + * Calling this method more than once will result in an exception. 
+ * @param plugins current plugin configurations + * @return configured instance of cluster event producer (with side-effects, see above) + */ + public DelegatingClusterEventProducer create(ContainerPluginsRegistry plugins) { +if (created) { + throw new RuntimeException("this factory can be called only once!"); +} +final DelegatingClusterEventProducer clusterEventProducer = new DelegatingClusterEventProducer(cc); +// since this is a ClusterSingleton, register it as such + cc.getClusterSingletons().getSingletons().put(ClusterEventProducer.PLUGIN_NAME +"_delegate", clusterEventProducer); +ContainerPluginsRegistry.ApiInfo clusterEventProducerInfo = plugins.getPlugin(ClusterEventProducer.PLUGIN_NAME); +if (clusterEventProducerInfo != null) { + // the listener in ClusterSingletons already registered it + clusterEventProducer.setDelegate((ClusterEventProducer) clusterEventProducerInfo.getInstance()); +} else { + // use the default NoOp impl +} +// transfer those listeners that were already registered to the initial impl +transferListeners(clusterEventProducer, plugins); +// install plugin registry listener +ContainerPluginsRegistry.PluginRegistryListener pluginListener = new ContainerPluginsRegistry.PluginRegistryListener() { + @Override + public void added(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + ClusterEvent
[GitHub] [lucene-solr] sigram commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
sigram commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517287374 ## File path: solr/core/src/java/org/apache/solr/cluster/events/ClusterEventProducerBase.java ## @@ -0,0 +1,85 @@ +package org.apache.solr.cluster.events; + +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Collections; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.ConcurrentHashMap; + +/** + * + */ +public abstract class ClusterEventProducerBase implements ClusterEventProducer { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + protected final Map> listeners = new ConcurrentHashMap<>(); + protected volatile State state = State.STOPPED; + protected final CoreContainer cc; + + protected ClusterEventProducerBase(CoreContainer cc) { +this.cc = cc; + } + + @Override + public void registerListener(ClusterEventListener listener, ClusterEvent.EventType... eventTypes) { +if (eventTypes == null || eventTypes.length == 0) { + eventTypes = ClusterEvent.EventType.values(); +} +for (ClusterEvent.EventType type : eventTypes) { + if (!getSupportedEventTypes().contains(type)) { +log.warn("event type {} not supported yet.", type); +continue; + } + // to avoid removing no-longer empty set in unregister + synchronized (listeners) { +listeners.computeIfAbsent(type, t -> ConcurrentHashMap.newKeySet()) +.add(listener); + } +} + } + + @Override + public void unregisterListener(ClusterEventListener listener, ClusterEvent.EventType... eventTypes) { +if (eventTypes == null || eventTypes.length == 0) { + eventTypes = ClusterEvent.EventType.values(); +} +synchronized (listeners) { + for (ClusterEvent.EventType type : eventTypes) { +Set perType = listeners.get(type); +if (perType != null) { + perType.remove(listener); + if (perType.isEmpty()) { +listeners.remove(type); + } +} + } +} + } + + @Override + public State getState() { +return state; + } + + public Map> getEventListeners() { +return listeners; + } + + public CoreContainer getCoreContainer() { +return cc; + } + + public abstract Set getSupportedEventTypes(); + + protected void fireEvent(ClusterEvent event) { +listeners.getOrDefault(event.getType(), Collections.emptySet()) Review comment: Good point, muse-dev :) Fixing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14977) Container plugins need a way to be configured
[ https://issues.apache.org/jira/browse/SOLR-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14977: -- Attachment: SOLR-14977.patch > Container plugins need a way to be configured > - > > Key: SOLR-14977 > URL: https://issues.apache.org/jira/browse/SOLR-14977 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Plugin system >Reporter: Andrzej Bialecki >Priority: Major > Attachments: SOLR-14977.patch > > > Container plugins are defined in {{/clusterprops.json:/plugin}} using a > simple {{PluginMeta}} bean. This is sufficient for implementations that don't > need any configuration except for the {{pathPrefix}} but insufficient for > anything else that needs more configuration parameters. > An example would be a {{CollectionsRepairEventListener}} plugin proposed in > PR-1962, which needs parameters such as the list of collections, {{waitFor}}, > maximum operations allowed, etc. to properly function. > This issue proposes to extend the {{PluginMeta}} bean to allow a > {{Map}} configuration parameters. > There is an interface that we could potentially use ({{MapInitializedPlugin}} > but it works only with {{String}} values. This is not optimal because it > requires additional type-safety validation from the consumers. The existing > {{PluginInfo}} / {{PluginInfoInitialized}} interface is too complex for this > purpose. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
sigram commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517290048 ## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/ClusterEventProducerFactory.java ## @@ -0,0 +1,173 @@ +package org.apache.solr.cluster.events.impl; + +import org.apache.solr.api.ContainerPluginsRegistry; +import org.apache.solr.cluster.events.ClusterEvent; +import org.apache.solr.cluster.events.ClusterEventListener; +import org.apache.solr.cluster.events.ClusterEventProducer; +import org.apache.solr.cluster.events.ClusterEventProducerBase; +import org.apache.solr.cluster.events.NoOpProducer; +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Set; + +/** + * This class helps in handling the initial registration of plugin-based listeners, + * when both the final {@link ClusterEventProducer} implementation and listeners + * are configured using plugins. + */ +public class ClusterEventProducerFactory extends ClusterEventProducerBase { Review comment: Yes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
sigram commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517293581 ## File path: solr/core/src/java/org/apache/solr/cluster/events/impl/ClusterEventProducerFactory.java ## @@ -0,0 +1,173 @@ +package org.apache.solr.cluster.events.impl; + +import org.apache.solr.api.ContainerPluginsRegistry; +import org.apache.solr.cluster.events.ClusterEvent; +import org.apache.solr.cluster.events.ClusterEventListener; +import org.apache.solr.cluster.events.ClusterEventProducer; +import org.apache.solr.cluster.events.ClusterEventProducerBase; +import org.apache.solr.cluster.events.NoOpProducer; +import org.apache.solr.core.CoreContainer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Set; + +/** + * This class helps in handling the initial registration of plugin-based listeners, + * when both the final {@link ClusterEventProducer} implementation and listeners + * are configured using plugins. + */ +public class ClusterEventProducerFactory extends ClusterEventProducerBase { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + private ContainerPluginsRegistry.PluginRegistryListener initialPluginListener; + private boolean created = false; + + public ClusterEventProducerFactory(CoreContainer cc) { +super(cc); +initialPluginListener = new ContainerPluginsRegistry.PluginRegistryListener() { + @Override + public void added(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + registerListener((ClusterEventListener) instance); +} + } + + @Override + public void deleted(ContainerPluginsRegistry.ApiInfo plugin) { +if (plugin == null || plugin.getInstance() == null) { + return; +} +Object instance = plugin.getInstance(); +if (instance instanceof ClusterEventListener) { + unregisterListener((ClusterEventListener) instance); +} + } + + @Override + public void modified(ContainerPluginsRegistry.ApiInfo old, ContainerPluginsRegistry.ApiInfo replacement) { +added(replacement); +deleted(old); + } +}; + } + + @Override + public Set getSupportedEventTypes() { +return NoOpProducer.ALL_EVENT_TYPES; + } + + /** + * This method returns an initial plugin registry listener that helps to capture the + * freshly loaded listener plugins before the final cluster event producer is created. + * @return initial listener + */ + public ContainerPluginsRegistry.PluginRegistryListener getPluginRegistryListener() { +return initialPluginListener; + } + + /** + * Create a {@link ClusterEventProducer} based on the current plugin configurations. + * NOTE: this method can only be called once because it has side-effects, such as + * transferring the initially collected listeners to the resulting producer's instance, and + * installing a {@link org.apache.solr.api.ContainerPluginsRegistry.PluginRegistryListener}. + * Calling this method more than once will result in an exception. 
+ * @param plugins current plugin configurations + * @return configured instance of cluster event producer (with side-effects, see above) + */ + public DelegatingClusterEventProducer create(ContainerPluginsRegistry plugins) { +if (created) { + throw new RuntimeException("this factory can be called only once!"); +} +final DelegatingClusterEventProducer clusterEventProducer = new DelegatingClusterEventProducer(cc); +// since this is a ClusterSingleton, register it as such + cc.getClusterSingletons().getSingletons().put(ClusterEventProducer.PLUGIN_NAME +"_delegate", clusterEventProducer); +ContainerPluginsRegistry.ApiInfo clusterEventProducerInfo = plugins.getPlugin(ClusterEventProducer.PLUGIN_NAME); +if (clusterEventProducerInfo != null) { + // the listener in ClusterSingletons already registered it + clusterEventProducer.setDelegate((ClusterEventProducer) clusterEventProducerInfo.getInstance()); +} else { + // use the default NoOp impl +} +// transfer those listeners that were already registered to the initial impl +transferListeners(clusterEventProducer, plugins); Review comment: No. At this point in the initialization of CoreContainer any source of events will be ignored anyway (because the factory is basically a no-op implementation that doesn't generate any events), and listeners are being transferred before the final implementation is installed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH
[jira] [Commented] (SOLR-14067) Move StatelessScriptUpdateProcessor to a contrib
[ https://issues.apache.org/jira/browse/SOLR-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226038#comment-17226038 ] David Eric Pugh commented on SOLR-14067: I am going to try and pick this up again today and make the naming change and remove the "@Deprecation", and hopefully have it ready for 8.7. > Move StatelessScriptUpdateProcessor to a contrib > > > Key: SOLR-14067 > URL: https://issues.apache.org/jira/browse/SOLR-14067 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: David Eric Pugh >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > Move server-side scripting out of core and into a new contrib. This is > better for security. > Former description: > > We should eliminate all scripting capabilities within Solr. Let us start with > the StatelessScriptUpdateProcessor deprecation/removal. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] janhoy opened a new pull request #2062: LUCENE-9589 Swedish minimal stemmer
janhoy opened a new pull request #2062: URL: https://github.com/apache/lucene-solr/pull/2062 See https://issues.apache.org/jira/browse/LUCENE-9589 This impl is based on `[SwedishLightStemmer](https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/sv/SwedishLightStemmer.java)`, concentrating on the plural endings only, inspired by `[NorwegianMinimalStemmer](https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/no/NorwegianMinimalStemmer.java)`. Some of the examples tested in `minimal.txt` are fetched from https://en.wikipedia.org/wiki/Swedish_grammar, not any scientific rule book of any kind. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9589) Swedish Minimal Stemmer
[ https://issues.apache.org/jira/browse/LUCENE-9589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226045#comment-17226045 ] Jan Høydahl commented on LUCENE-9589: - Pull request here https://github.com/apache/lucene-solr/pull/2062 > Swedish Minimal Stemmer > --- > > Key: LUCENE-9589 > URL: https://issues.apache.org/jira/browse/LUCENE-9589 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Swedish has a {{SwedishLightStemmer}} but lacks a Minimal stemmer that would > only stem singular/plural. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9589) Swedish Minimal Stemmer
[ https://issues.apache.org/jira/browse/LUCENE-9589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated LUCENE-9589: Status: Patch Available (was: Open) > Swedish Minimal Stemmer > --- > > Key: LUCENE-9589 > URL: https://issues.apache.org/jira/browse/LUCENE-9589 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Swedish has a {{SwedishLightStemmer}} but lacks a Minimal stemmer that would > only stem singular/plural. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14067) Move StatelessScriptUpdateProcessor to a contrib
[ https://issues.apache.org/jira/browse/SOLR-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226053#comment-17226053 ] David Smiley commented on SOLR-14067: - It'll make it into 8.8; the 8.7 release is already in progress. > Move StatelessScriptUpdateProcessor to a contrib > > > Key: SOLR-14067 > URL: https://issues.apache.org/jira/browse/SOLR-14067 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: David Eric Pugh >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > Move server-side scripting out of core and into a new contrib. This is > better for security. > Former description: > > We should eliminate all scripting capabilities within Solr. Let us start with > the StatelessScriptUpdateProcessor deprecation/removal. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14984) Solr standalone core not used as collection in authorization
Dr. Daniel Georg Kirschner created SOLR-14984: - Summary: Solr standalone core not used as collection in authorization Key: SOLR-14984 URL: https://issues.apache.org/jira/browse/SOLR-14984 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Authorization Affects Versions: 8.6.3 Environment: Solr 8.6.3 (with techproducts sample as "tech"). Reporter: Dr. Daniel Georg Kirschner In org.apache.solr.servlet.HttpSolrCall, the method AuthorizationContext getAuthCtx() does not seem to use the core in the collectionRequests, which leads to org.apache.solr.security.RuleBasedAuthorizationPluginBase method authorize() not using the core in the authorization rules. IMHO, this seems not to be what is intended security-wise. My use case seems to be solved by the following change (in HttpSolrCall.getAuthCtx()). Original: SolrParams params = getQueryParams(); final ArrayList<CollectionRequest> collectionRequests = new ArrayList<>(); for (String collection : getCollectionsList()) { collectionRequests.add(new CollectionRequest(collection)); } New: SolrParams params = getQueryParams(); final ArrayList<CollectionRequest> collectionRequests = new ArrayList<>(); for (String collection : getCollectionsList()) { collectionRequests.add(new CollectionRequest(collection)); } *if (core != null) {* *collectionRequests.add(new CollectionRequest(core.getName()));* *}* I do not understand the full concept of the authorization code. Please check whether this quick fix actually works for all use cases. Best regards, Daniel Kirschner -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #1962: SOLR-14749 Provide a clean API for cluster-level event processing
muse-dev[bot] commented on a change in pull request #1962: URL: https://github.com/apache/lucene-solr/pull/1962#discussion_r517470119 ## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java ## @@ -100,6 +101,15 @@ public void writeMap(EntryWriter ew) throws IOException { currentPlugins.forEach(ew.getBiConsumer()); } + @Override + public void close() throws IOException { +currentPlugins.values().forEach(apiInfo -> { Review comment: *THREAD_SAFETY_VIOLATION:* Read/Write race. Non-private method `ContainerPluginsRegistry.close()` reads without synchronization from container `this.currentPlugins` via call to `Map.values()`. Potentially races with write in method `ContainerPluginsRegistry.refresh()`. Reporting because another access to the same memory occurs on a background thread, although this access may not. ## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java ## @@ -100,6 +101,15 @@ public void writeMap(EntryWriter ew) throws IOException { currentPlugins.forEach(ew.getBiConsumer()); Review comment: *THREAD_SAFETY_VIOLATION:* Read/Write race. Non-private method `ContainerPluginsRegistry.writeMap(...)` reads without synchronization from container `this.currentPlugins` via call to `Map.forEach(...)`. Potentially races with write in method `ContainerPluginsRegistry.refresh()`. Reporting because another access to the same memory occurs on a background thread, although this access may not. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
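Both findings describe unsynchronized reads racing with the write in refresh() on a background thread. A minimal standalone sketch of one common remedy (illustrative only, not the actual ContainerPluginsRegistry fix): back the plugin map with ConcurrentHashMap so that iteration in close() and writeMap() is safe against a concurrent refresh().

    import java.io.Closeable;
    import java.io.IOException;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch: ConcurrentHashMap gives iterators a weakly consistent view,
    // so close() can never hit a ConcurrentModificationException mid-refresh.
    class PluginRegistrySketch implements Closeable {
      private final Map<String, AutoCloseable> currentPlugins = new ConcurrentHashMap<>();

      void refresh(Map<String, AutoCloseable> latest) {
        currentPlugins.keySet().retainAll(latest.keySet()); // drop removed plugins
        currentPlugins.putAll(latest);                      // add or replace the rest
      }

      @Override
      public void close() throws IOException {
        for (AutoCloseable plugin : currentPlugins.values()) {
          try {
            plugin.close();
          } catch (Exception e) {
            throw new IOException(e);
          }
        }
      }
    }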
[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226270#comment-17226270 ] Mark Robert Miller commented on SOLR-14788: --- I should also mention, for those thinking about and looking at removing the Overseer entirely - like almost everything that's been done here, you will find that much easier to do with these changes rather than harder. > Solr: The Next Big Thing > > > Key: SOLR-14788 > URL: https://issues.apache.org/jira/browse/SOLR-14788 > Project: Solr > Issue Type: Task >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Critical > > h3. > [!https://www.unicode.org/consortium/aacimg/1F46E.png!|https://www.unicode.org/consortium/adopted-characters.html#b1F46E]{color:#00875a}*The > Policeman is on duty!*{color} > {quote}_{color:#de350b}*When The Policeman is on duty, sit back, relax, and > have some fun. Try to make some progress. Don't stress too much about the > impact of your changes or maintaining stability and performance and > correctness so much. Until the end of phase 1, I've got your back. I have a > variety of tools and contraptions I have been building over the years and I > will continue training them on this branch. I will review your changes and > peer out across the land and course correct where needed. As Mike D will be > thinking, "Sounds like a bottleneck Mark." And indeed it will be to some > extent. Which is why once stage one is completed, I will flip The Policeman > to off duty. When off duty, I'm always* {color:#de350b}*occasionally*{color} > *down for some vigilante justice, but I won't be walking the beat, all that > stuff about sit back and relax goes out the window.*{color}_ > {quote} > > I have stolen this title from Ishan or Noble and Ishan. > This issue is meant to capture the work of a small team that is forming to > push Solr and SolrCloud to the next phase. > I have kicked off the work with an effort to create a very fast and solid > base. That work is not 100% done, but it's ready to join the fight. > Tim Potter has started giving me a tremendous hand in finishing up. Ishan and > Noble have already contributed support and testing and have plans for > additional work to shore up some of our current shortcomings. > Others have expressed an interest in helping and hopefully they will pop up > here as well. > Let's organize and discuss our efforts here and in various sub issues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14977) Container plugins need a way to be configured
[ https://issues.apache.org/jira/browse/SOLR-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226287#comment-17226287 ] Ilan Ginzburg commented on SOLR-14977: -- Placement plugin config (names, types and values) is decided by the plugin writer, so it can't be hard-coded in a Solr class. A plugin might use configuration parameters named {{myfirstString}}, {{aLong}}, {{aDoubleConfig}} and {{shouldIStay}} (poor name choices if you ask me). A more realistic plugin would go for {{minimalFreeDiskGB}} and {{deprioritizedFreeDiskGB}} (see [SamplePluginAffinityReplicaPlacement|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cluster/placement/plugins/SamplePluginAffinityReplicaPlacement.java#L86]). > Container plugins need a way to be configured > - > > Key: SOLR-14977 > URL: https://issues.apache.org/jira/browse/SOLR-14977 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Plugin system >Reporter: Andrzej Bialecki >Priority: Major > Attachments: SOLR-14977.patch > > > Container plugins are defined in {{/clusterprops.json:/plugin}} using a > simple {{PluginMeta}} bean. This is sufficient for implementations that don't > need any configuration except for the {{pathPrefix}} but insufficient for > anything else that needs more configuration parameters. > An example would be a {{CollectionsRepairEventListener}} plugin proposed in > PR-1962, which needs parameters such as the list of collections, {{waitFor}}, > maximum operations allowed, etc. to properly function. > This issue proposes to extend the {{PluginMeta}} bean to allow a > {{Map}} of configuration parameters. > There is an interface that we could potentially use ({{MapInitializedPlugin}}) > but it works only with {{String}} values. This is not optimal because it > requires additional type-safety validation from the consumers. The existing > {{PluginInfo}} / {{PluginInfoInitialized}} interface is too complex for this > purpose. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
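A sketch of the type-safe shape being asked for, in contrast to the String-only {{MapInitializedPlugin}} (all names below are hypothetical, chosen to echo the examples in the comment above; this is not the API the issue eventually settled on):

    // Hypothetical config bean: fields carry declared types, so consumers
    // do no String parsing or type-safety validation of their own.
    class PlacementPluginConfigSketch {
      public Long minimalFreeDiskGB;
      public Long deprioritizedFreeDiskGB;
      public Boolean shouldIStay;
    }

    // Hypothetical typed counterpart to MapInitializedPlugin.
    interface TypedConfigPluginSketch<C> {
      void configure(C config);
    }

    class SamplePlacementPluginSketch implements TypedConfigPluginSketch<PlacementPluginConfigSketch> {
      private PlacementPluginConfigSketch cfg;

      @Override
      public void configure(PlacementPluginConfigSketch config) {
        this.cfg = config; // values arrive already typed
      }

      long minimalFreeDiskGB() {
        return cfg.minimalFreeDiskGB != null ? cfg.minimalFreeDiskGB : 20L; // the 20GB fallback is invented
      }
    }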
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r517502678 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java ## @@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, SolrParams params, @Override public Query parse() throws SyntaxError { // ReRanking Model - final String modelName = localParams.get(LTRQParserPlugin.MODEL); - if ((modelName == null) || modelName.isEmpty()) { + final String[] modelNames = localParams.getParams(LTRQParserPlugin.MODEL); + if ((modelNames == null) || modelNames.length==0 || modelNames[0].isEmpty()) { throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Must provide model in the request"); } - - final LTRScoringModel ltrScoringModel = mr.getModel(modelName); - if (ltrScoringModel == null) { -throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, -"cannot find " + LTRQParserPlugin.MODEL + " " + modelName); - } - - final String modelFeatureStoreName = ltrScoringModel.getFeatureStoreName(); - final boolean extractFeatures = SolrQueryRequestContextUtils.isExtractingFeatures(req); - final String fvStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - // Check if features are requested and if the model feature store and feature-transform feature store are the same - final boolean featuresRequestedFromSameStore = (modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? extractFeatures:false; - if (threadManager != null) { - threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); - } - final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel, - extractEFIParams(localParams), - featuresRequestedFromSameStore, threadManager); - - // Enable the feature vector caching if we are extracting features, and the features - // we requested are the same ones we are reranking with - if (featuresRequestedFromSameStore) { -scoringQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); + + LTRScoringQuery[] rerankingQueries = new LTRScoringQuery[modelNames.length]; + for (int i = 0; i < modelNames.length; i++) { +final LTRScoringQuery rerankingQuery; +if (!ORIGINAL_RANKING.equals(modelNames[i])) { Review comment: Hi @alessandrobenedetti! Apologies for not returning to here sooner. Good point about Apache Solr already using "special names" in various places, sure, let's go with "_OriginalRanking_" as the special name then. Hmm, the UI treats underscore special here, so `"_OriginalRanking_"` is what's in the code currently. > ... we could move the review to the re-scoring part, that is more delicate ... "delicate" is an excellent word choice, thank you, I spent some pleasant time today continuing to learn more about the re-scoring part. Took a sort of outside-to-inside approach i.e. starting at `LTRQParserPlugin` and then looking at what changed and how the bits fit together. If it's okay with you I'll push a branch to my fork and add comments/questions sometimes with commit links here, not quite sure if that might work, let's try? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
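For orientation, the request shape under review looks roughly like rq={!ltr model=myModelA model=_OriginalRanking_ reRankDocs=100} - myModelA is a placeholder, and the repeated model local param plus the _OriginalRanking_ special name are taken from this PR's diff: two models trigger interleaving, and the special name enters the original (un-reranked) ordering as one of the competitors.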
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r517514900 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java ## @@ -73,6 +74,8 @@ final private Map efi; // Original solr query used to fetch matching documents private Query originalQuery; + // Model was picked for this Docs + private Set pickedInterleavingDocIds; Review comment: 4.1/n The addition of a new member here which is only sometimes applicable got me wondering about possibly having `LTRInterleavingScoringQuery extends LTRScoringQuery` inheritance. ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java ## @@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, SolrParams params, @Override public Query parse() throws SyntaxError { // ReRanking Model - final String modelName = localParams.get(LTRQParserPlugin.MODEL); - if ((modelName == null) || modelName.isEmpty()) { + final String[] modelNames = localParams.getParams(LTRQParserPlugin.MODEL); + if ((modelNames == null) || modelNames.length==0 || modelNames[0].isEmpty()) { throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Must provide model in the request"); } - - final LTRScoringModel ltrScoringModel = mr.getModel(modelName); - if (ltrScoringModel == null) { -throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, -"cannot find " + LTRQParserPlugin.MODEL + " " + modelName); - } - - final String modelFeatureStoreName = ltrScoringModel.getFeatureStoreName(); - final boolean extractFeatures = SolrQueryRequestContextUtils.isExtractingFeatures(req); - final String fvStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - // Check if features are requested and if the model feature store and feature-transform feature store are the same - final boolean featuresRequestedFromSameStore = (modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? extractFeatures:false; - if (threadManager != null) { - threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); - } - final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel, - extractEFIParams(localParams), - featuresRequestedFromSameStore, threadManager); - - // Enable the feature vector caching if we are extracting features, and the features - // we requested are the same ones we are reranking with - if (featuresRequestedFromSameStore) { -scoringQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); + + LTRScoringQuery[] rerankingQueries = new LTRScoringQuery[modelNames.length]; + for (int i = 0; i < modelNames.length; i++) { +final LTRScoringQuery rerankingQuery; +if (!ORIGINAL_RANKING.equals(modelNames[i])) { + final LTRScoringModel ltrScoringModel = mr.getModel(modelNames[i]); + if (ltrScoringModel == null) { +throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, +"cannot find " + LTRQParserPlugin.MODEL + " " + modelNames[i]); + } + final String modelFeatureStoreName = ltrScoringModel.getFeatureStoreName(); + final boolean extractFeatures = SolrQueryRequestContextUtils.isExtractingFeatures(req); + final String fvStoreName = SolrQueryRequestContextUtils.getFvStoreName(req);// Check if features are requested and if the model feature store and feature-transform feature store are the same + final boolean featuresRequestedFromSameStore = (modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? 
extractFeatures : false; + if (threadManager != null) { + threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); + } + rerankingQuery = new LTRScoringQuery(ltrScoringModel, + extractEFIParams(localParams), + featuresRequestedFromSameStore, threadManager); + + // Enable the feature vector caching if we are extracting features, and the features + // we requested are the same ones we are reranking with + if (featuresRequestedFromSameStore) { +rerankingQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); + } +}else{ + rerankingQuery = new LTRScoringQuery(null); +} + +// External features +rerankingQuery.setRequest(req); +rerankingQueries[i] = rerankingQuery; } - SolrQueryRequestContextUtils.setScoringQuery(req, scoringQuery); + SolrQueryRequestContextUtils.setScoringQuery(req, rerankingQueries); int reRankDocs = localParams.getInt(RERAN
[jira] [Commented] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
[ https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226325#comment-17226325 ] Otis Gospodnetic commented on SOLR-14588: - Are these circuit breakers exposed as metrics so one can monitor and alert on them? > Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker > -- > > Key: SOLR-14588 > URL: https://issues.apache.org/jira/browse/SOLR-14588 > Project: Solr > Issue Type: Improvement >Reporter: Atri Sharma >Assignee: Atri Sharma >Priority: Blocker > Fix For: master (9.0), 8.7 > > Time Spent: 13h 50m > Remaining Estimate: 0h > > This Jira tracks addition of circuit breakers in the search path and > implements JVM based circuit breaker which rejects incoming search requests > if the JVM heap usage exceeds a defined percentage. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
alessandrobenedetti commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r517551810 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/search/LTRQParserPlugin.java ## @@ -146,93 +149,114 @@ public LTRQParser(String qstr, SolrParams localParams, SolrParams params, @Override public Query parse() throws SyntaxError { // ReRanking Model - final String modelName = localParams.get(LTRQParserPlugin.MODEL); - if ((modelName == null) || modelName.isEmpty()) { + final String[] modelNames = localParams.getParams(LTRQParserPlugin.MODEL); + if ((modelNames == null) || modelNames.length==0 || modelNames[0].isEmpty()) { throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Must provide model in the request"); } - - final LTRScoringModel ltrScoringModel = mr.getModel(modelName); - if (ltrScoringModel == null) { -throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, -"cannot find " + LTRQParserPlugin.MODEL + " " + modelName); - } - - final String modelFeatureStoreName = ltrScoringModel.getFeatureStoreName(); - final boolean extractFeatures = SolrQueryRequestContextUtils.isExtractingFeatures(req); - final String fvStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - // Check if features are requested and if the model feature store and feature-transform feature store are the same - final boolean featuresRequestedFromSameStore = (modelFeatureStoreName.equals(fvStoreName) || fvStoreName == null) ? extractFeatures:false; - if (threadManager != null) { - threadManager.setExecutor(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); - } - final LTRScoringQuery scoringQuery = new LTRScoringQuery(ltrScoringModel, - extractEFIParams(localParams), - featuresRequestedFromSameStore, threadManager); - - // Enable the feature vector caching if we are extracting features, and the features - // we requested are the same ones we are reranking with - if (featuresRequestedFromSameStore) { -scoringQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); + + LTRScoringQuery[] rerankingQueries = new LTRScoringQuery[modelNames.length]; + for (int i = 0; i < modelNames.length; i++) { +final LTRScoringQuery rerankingQuery; +if (!ORIGINAL_RANKING.equals(modelNames[i])) { Review comment: Thanks @cpoerschke ! Tomorrow I'll address all your comments and think about them! My intent is to merge within a few days, so hopefully, we can close this soon! Thank you again for your time and contribution, much appreciated! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2062: LUCENE-9589 Swedish minimal stemmer
dweiss commented on a change in pull request #2062: URL: https://github.com/apache/lucene-solr/pull/2062#discussion_r517582098 ## File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/sv/SwedishMinimalStemFilterFactory.java ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.analysis.sv; + + +import org.apache.lucene.analysis.TokenFilterFactory; +import org.apache.lucene.analysis.TokenStream; + +import java.util.Map; + +/** + * Factory for {@link SwedishMinimalStemFilter}. + * Review comment: Would code tag read better here? https://reflectoring.io/howto-format-code-snippets-in-javadoc/#pre--code This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
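For reference, the style the linked write-up recommends, applied to a factory javadoc of this shape ({@code} suppresses HTML interpretation, so the angle brackets need no escaping; the fieldType snippet below is illustrative, not the PR's exact example):

    /**
     * Factory for {@link SwedishMinimalStemFilter}.
     *
     * <pre>{@code
     * <fieldType name="text_svminstem" class="solr.TextField" positionIncrementGap="100">
     *   <analyzer>
     *     <tokenizer class="solr.StandardTokenizerFactory"/>
     *     <filter class="solr.SwedishMinimalStemFilterFactory"/>
     *   </analyzer>
     * </fieldType>
     * }</pre>
     */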
[GitHub] [lucene-solr] madrob commented on a change in pull request #2055: SOLR-14978 OOM Killer in Foreground
madrob commented on a change in pull request #2055: URL: https://github.com/apache/lucene-solr/pull/2055#discussion_r517585159 ## File path: solr/docker/include/scripts/solr-fg ## @@ -15,21 +15,19 @@ if [[ -z "${OOM:-}" ]]; then OOM='none' fi case "$OOM" in Review comment: After some experimentation, I don't think these do anything. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #2055: SOLR-14978 OOM Killer in Foreground
HoustonPutman commented on a change in pull request #2055: URL: https://github.com/apache/lucene-solr/pull/2055#discussion_r517610328 ## File path: solr/bin/solr ## @@ -2172,6 +2172,16 @@ function start_solr() { SOLR_OPTS+=($AUTHC_OPTS) fi + # If a heap dump directory is specified, enable it in SOLR_OPTS + if [[ -z "$SOLR_HEAP_DUMP_DIR" ]] && [[ "$SOLR_HEAP_DUMP" == "true" ]]; then +SOLR_HEAP_DUMP_DIR="${SOLR_LOGS_DIR}/dumps" + fi + if [[ -n "$SOLR_HEAP_DUMP_DIR" ]]; then +SOLR_OPTS+=("-XX:+HeapDumpOnOutOfMemoryError") +SOLR_OPTS+=("-XX:HeapDumpPath=$SOLR_HEAP_DUMP_DIR/solr-$(date +%s)-pid$$.hprof") +mkdir -p "$SOLR_HEAP_DUMP_DIR" 2>/dev/null Review comment: Why are you piping the errors here? `mkdir -p` should be safe right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9599) Make comparators aware of index sorting
Mayya Sharipova created LUCENE-9599: --- Summary: Make comparators aware of index sorting Key: LUCENE-9599 URL: https://issues.apache.org/jira/browse/LUCENE-9599 Project: Lucene - Core Issue Type: Improvement Reporter: Mayya Sharipova LUCENE-9280 introduced an ability for comparators to skip non-competitive documents. But currently comparators are not aware of index sorting, and are not able to early terminate when search sort is equal to index sort. Currently, if search sort is equal to index sort, we have an early termination in TopFieldCollector. As we work to rely on comparators to provide skipping functionality, we would like to move this termination functionality on index sort from TopFieldCollector to comparators. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #2010: SOLR-12182: Store URL scheme as replaceable parameter in state.json
muse-dev[bot] commented on a change in pull request #2010: URL: https://github.com/apache/lucene-solr/pull/2010#discussion_r517612808 ## File path: solr/solrj/src/java/org/apache/solr/common/cloud/ZkCoreNodeProps.java ## @@ -16,15 +16,20 @@ */ package org.apache.solr.common.cloud; +import java.util.Objects; + public class ZkCoreNodeProps { private ZkNodeProps nodeProps; public ZkCoreNodeProps(ZkNodeProps nodeProps) { +Objects.requireNonNull(nodeProps, "nodeProps should not be null"); this.nodeProps = nodeProps; } public String getCoreUrl() { -return getCoreUrl(nodeProps.getStr(ZkStateReader.BASE_URL_PROP), nodeProps.getStr(ZkStateReader.CORE_NAME_PROP)); +String baseUrl = nodeProps.getStr(ZkStateReader.BASE_URL_PROP); +Objects.requireNonNull(baseUrl, "No base_url in ZkNodeProps! "+nodeProps.toString()); +return getCoreUrl(baseUrl, nodeProps.getStr(ZkStateReader.CORE_NAME_PROP)); Review comment: *NULL_DEREFERENCE:* object `baseUrl` last assigned on line 30 could be null and is dereferenced by call to `getCoreUrl(...)` at line 32. ## File path: solr/solrj/src/java/org/apache/solr/common/cloud/UrlScheme.java ## @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.common.cloud; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.net.URLEncoder; +import java.nio.charset.StandardCharsets; +import java.util.Map; +import java.util.Objects; +import java.util.Optional; +import java.util.SortedSet; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ConcurrentMap; + +import org.apache.commons.lang.StringUtils; +import org.apache.solr.common.SolrException; +import org.apache.solr.common.util.Utils; +import org.apache.zookeeper.KeeperException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.solr.common.cloud.ZkStateReader.URL_SCHEME; + +/** + * Singleton access to global vars in persisted state, such as the urlScheme, which although is stored in ZK as a cluster property + * really should be treated like a static global that is set at initialization and not altered after. 
+ */ +public enum UrlScheme implements LiveNodesListener, ClusterPropertiesListener { + INSTANCE; + + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + public static final String HTTP = "http"; + public static final String HTTPS = "https"; + public static final String HTTPS_PORT_PROP = "solr.jetty.https.port"; + public static final String USE_LIVENODES_URL_SCHEME = "ext.useLiveNodesUrlScheme"; + + private String urlScheme = HTTP; + private boolean useLiveNodesUrlScheme = false; + private SortedSet liveNodes = null; + private SolrZkClient zkClient = null; + private final ConcurrentMap nodeSchemeCache = new ConcurrentHashMap<>(); + + /** + * Called during ZkController initialization to set the urlScheme based on cluster properties. + * @param client The SolrZkClient needed to read cluster properties from ZK. + * @throws IOException If a connection or other I/O related error occurs while reading from ZK. + */ + public synchronized void initFromClusterProps(final SolrZkClient client) throws IOException { +this.zkClient = client; + +// Have to go directly to the cluster props b/c this needs to happen before ZkStateReader does its thing +ClusterProperties clusterProps = new ClusterProperties(client); +this.useLiveNodesUrlScheme = + "true".equals(clusterProps.getClusterProperty(UrlScheme.USE_LIVENODES_URL_SCHEME, "false")); +setUrlSchemeFromClusterProps(clusterProps.getClusterProperties()); + } + + private void setUrlSchemeFromClusterProps(Map props) { +// Set the global urlScheme from cluster prop or if that is not set, look at the urlScheme sys prop +final String scheme = (String)props.get(ZkStateReader.URL_SCHEME); +if (StringUtils.isNotEmpty(scheme)) { + // track the urlScheme in a global so we can use it during ZK read / write operations for cluster state objects + this.urlScheme = HTTPS.equals(scheme) ? HTTPS : HTTP; +} else { +
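On the NULL_DEREFERENCE finding above: analyzers like Infer often accept the value returned by Objects.requireNonNull as a non-null witness, whereas a requireNonNull call on a standalone line may not be tracked back to the original local. A self-contained sketch of that variant (a plain Map stands in for ZkNodeProps here; this is not the PR's actual fix):

    import java.util.Map;
    import java.util.Objects;

    class CoreUrlSketch {
      private final Map<String, String> nodeProps;

      CoreUrlSketch(Map<String, String> nodeProps) {
        this.nodeProps = Objects.requireNonNull(nodeProps, "nodeProps should not be null");
      }

      String getCoreUrl() {
        // dereference the value returned by requireNonNull, not the raw lookup
        final String baseUrl = Objects.requireNonNull(
            nodeProps.get("base_url"), "No base_url in ZkNodeProps! " + nodeProps);
        return baseUrl + "/" + nodeProps.get("core");
      }
    }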
[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #2063: LUCENE-9599 Make comparator aware of index sorting
mayya-sharipova opened a new pull request #2063: URL: https://github.com/apache/lucene-solr/pull/2063 Currently, if search sort is equal to index sort, we have an early termination in TopFieldCollector. As we work to enhance comparators to provide skipping functionality (PR #1351), we would like to move this termination functionality on index sort from TopFieldCollector to comparators. This patch does the following: - Add method usesIndexSort to LeafFieldComparator - Make numeric comparators aware of index sort and early terminate on collecting all competitive hits - Move TermValComparator and TermOrdValComparator from FieldComparator to comparator package, for all comparators to be in the same package - Enhance TermValComparator to provide skipping functionality when index is sorted One item left as a TODO for a follow-up PR is to remove the logic of early termination from TopFieldCollector. We can do that once we ensure that all BulkScorers are using iterators from collectors that can skip non-competitive docs. Relates to #1351 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
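A sketch of the shape this API could take (usesIndexSort follows the PR description; everything else is illustrative, not the patch itself):

    import org.apache.lucene.search.CollectionTerminatedException;

    // Illustrative leaf-level comparator: once the search sort is known to equal
    // the index sort, every doc after the queue fills is non-competitive, so the
    // segment can be terminated early via Lucene's standard signal.
    class IndexSortAwareLeafComparatorSketch {
      private final boolean searchSortEqualsIndexSort;
      private final int numHitsNeeded;
      private int collected;

      IndexSortAwareLeafComparatorSketch(boolean searchSortEqualsIndexSort, int numHitsNeeded) {
        this.searchSortEqualsIndexSort = searchSortEqualsIndexSort;
        this.numHitsNeeded = numHitsNeeded;
      }

      boolean usesIndexSort() {
        return searchSortEqualsIndexSort;
      }

      void collect(int doc) {
        if (usesIndexSort() && ++collected >= numHitsNeeded) {
          // standard Lucene way for a collector to say it is done with this leaf
          throw new CollectionTerminatedException();
        }
      }
    }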
[GitHub] [lucene-solr] madrob commented on a change in pull request #2055: SOLR-14978 OOM Killer in Foreground
madrob commented on a change in pull request #2055: URL: https://github.com/apache/lucene-solr/pull/2055#discussion_r517648922 ## File path: solr/bin/solr ## @@ -2172,6 +2172,16 @@ function start_solr() { SOLR_OPTS+=($AUTHC_OPTS) fi + # If a heap dump directory is specified, enable it in SOLR_OPTS + if [[ -z "$SOLR_HEAP_DUMP_DIR" ]] && [[ "$SOLR_HEAP_DUMP" == "true" ]]; then +SOLR_HEAP_DUMP_DIR="${SOLR_LOGS_DIR}/dumps" + fi + if [[ -n "$SOLR_HEAP_DUMP_DIR" ]]; then +SOLR_OPTS+=("-XX:+HeapDumpOnOutOfMemoryError") +SOLR_OPTS+=("-XX:HeapDumpPath=$SOLR_HEAP_DUMP_DIR/solr-$(date +%s)-pid$$.hprof") +mkdir -p "$SOLR_HEAP_DUMP_DIR" 2>/dev/null Review comment: I was doing it to match the call at https://github.com/apache/lucene-solr/blob/8f75d6ff4f0d981136f64a28c173b5ac4a7263ec/solr/bin/solr#L2243 but that also had additional checks afterwards. I extracted the commonality here and call that function instead. mkdir can fail if you don't have permissions to that directory path, which would also indicate that the process won't have permissions later to write the log files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226412#comment-17226412 ] David Smiley commented on SOLR-14923: - I've had trouble prioritizing this because it requires many hours to investigate through code I don't like. I'll try to give you some answers without (yet) really digging in: bq. Would it be sufficient to track the document ids which require a reload and clear them on each openRealTimeSearcher call? Where would the ID tracking you refer to _go_ (whose responsibility is it)? I don't think UpdateLog. org.apache.solr.update.processor.DistributedUpdateProcessor is doing a lot already. Thinking back to my suggestion on SOLR-12638, I think I was referring to RTGComponent because Mosh said that this guy was the thing that was involved for this use-case. And I was not imagining tracking an ever growing list of IDs somewhere; I think just some sort of dirty flag on RTGComponent. See the variable "mustUseRealtimeSearcher" there -- maybe we could make it get and clear some AtomicReference or something. It's worth a shot but it feels inelegant... I lack the deeper understanding as to why UpdateLog.openRealtimeSearcher must be called at all. Mosh at the time said "RTGComponent is not aware of the newly indexed yet not committed child docs.". This is foggy to me but I don't know why RTGComponent should be aware at all; I don't recall how RTGComponent is involved in the whole thing. Maybe between you and me, we shall figure this out :-) [~markrmil...@gmail.com]: AFAICT you originally added {{UpdateLog.openRealtimeSearcher}}. Why is it located _there_ instead of, say, UpdateHandler? I'm honestly confused that UpdateLog refers to the index altogether; it should be independent according to my conceptual understanding. When there isn't an updateLog (it's technically optional), then there may be a bug because the reader probably needs to be re-opened still. bq. What should be the result of two concurrent updates on the same document? I think it is the same as with normal atomic updates, and due to the fact that there is no rollback on transactions, this can only be detected by versioning. Yes; that's logical to me. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at the moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so ulog.openRealtimeSearcher(); is called, this call is very expensive, > and executed in a synchronized block of the UpdateLog instance, therefore all > other operations on the UpdateLog are blocked too. > Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block almost each operation is blocked. > This reduces multi threaded index update to a single thread behavior. > The described behavior does not depend on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
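A standalone sketch of the dirty-flag idea floated in this comment (names are hypothetical; this is not RealTimeGetComponent's actual code): the hot update path only marks the realtime view stale, and the RTG path pays the expensive reopen lazily, at most once per batch of updates.

    import java.util.concurrent.atomic.AtomicBoolean;

    class RtgDirtyFlagSketch {
      private final AtomicBoolean realtimeSearcherStale = new AtomicBoolean(false);

      // called from the update path instead of ulog.openRealtimeSearcher()
      void markStale() {
        realtimeSearcherStale.set(true);
      }

      // called from the RTG path before serving a lookup
      void ensureFresh(Runnable openRealtimeSearcher) {
        if (realtimeSearcherStale.compareAndSet(true, false)) {
          openRealtimeSearcher.run(); // the reopen happens at most once per burst of updates
        }
      }
    }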
[GitHub] [lucene-solr] madrob merged pull request #2055: SOLR-14978 OOM Killer in Foreground
madrob merged pull request #2055: URL: https://github.com/apache/lucene-solr/pull/2055 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14978) Run OOM Killer Script in Foreground Solr
[ https://issues.apache.org/jira/browse/SOLR-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226424#comment-17226424 ] Mike Drob commented on SOLR-14978: -- This has been committed to 9.0, leaving open while I figure out if this can go into 8.x as well. > Run OOM Killer Script in Foreground Solr > > > Key: SOLR-14978 > URL: https://issues.apache.org/jira/browse/SOLR-14978 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: scripts and tools >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Labels: operability > Time Spent: 1h 20m > Remaining Estimate: 0h > > As discussed in SOLR-8145 before the advent of Docker containers, Solr did > not usually run unattended and in the foreground so there was not much reason > to run the OOM killer script in that mode. However, now it makes sense to > handle it in the foreground as well. > We should also consider making it easier to configure heap dumps as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14978) Run OOM Killer Script in Foreground Solr
[ https://issues.apache.org/jira/browse/SOLR-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226425#comment-17226425 ] ASF subversion and git services commented on SOLR-14978: Commit 7c1ff288b73b053cc9d17c6d4db4b35ed6c5559a in lucene-solr's branch refs/heads/master from Mike Drob [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7c1ff28 ] SOLR-14978 OOM Killer in Foreground (#2055) Combine Docker and bin/solr OOM handling scripts, move OOM handling to foreground Solr as well. Co-authored-by: Houston Putman > Run OOM Killer Script in Foreground Solr > > > Key: SOLR-14978 > URL: https://issues.apache.org/jira/browse/SOLR-14978 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: scripts and tools >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Labels: operability > Time Spent: 1h 20m > Remaining Estimate: 0h > > As discussed in SOLR-8145 before the advent of Docker containers, Solr did > not usually run unattended and in the foreground so there was not much reason > to run the OOM killer script in that mode. However, now it makes sense to > handle it in the foreground as well. > We should also consider making it easier to configure heap dumps as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226436#comment-17226436 ] Mark Robert Miller commented on SOLR-14923: --- I'll have to look to see. It doesn't sound like my area at the time; I would assume Yonik, and maybe it somehow came through me. If I recall right, I thought that reopen is for RTG in general and not just child docs. There are a variety of other reasons child and atomic docs are slow though. I've done a lot on that path on the ref branch; it would be interesting to see how much better it is, but it doesn't change the fact that those kinds of docs do synchronous instead of asynchronous updates. Things will also depend on how you are sending updates - streaming, batching, or singles. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at the moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so ulog.openRealtimeSearcher(); is called, this call is very expensive, > and executed in a synchronized block of the UpdateLog instance, therefore all > other operations on the UpdateLog are blocked too. > Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block almost each operation is blocked. > This reduces multi threaded index update to a single thread behavior. > The described behavior does not depend on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #2010: SOLR-12182: Store URL scheme as replaceable parameter in state.json
muse-dev[bot] commented on a change in pull request #2010: URL: https://github.com/apache/lucene-solr/pull/2010#discussion_r517712472 ## File path: solr/solrj/src/java/org/apache/solr/common/cloud/Replica.java ## @@ -235,11 +234,11 @@ public String getName() { } public String getCoreUrl() { -return ZkCoreNodeProps.getCoreUrl(getStr(ZkStateReader.BASE_URL_PROP), core); +return ZkCoreNodeProps.getCoreUrl(getBaseUrl(), core); Review comment: *NULL_DEREFERENCE:* object returned by `getBaseUrl()` could be null and is dereferenced by call to `getCoreUrl(...)` at line 237. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226449#comment-17226449 ] Mark Robert Miller commented on SOLR-14923: --- I’ll take a look in the next few days, but FYI, my recollection is that the reopen was for all updates and is required for RTG. I’ve played around with moving it out of the sync block and I believe that broke RTG. Making those syncs “fair” helps. If you blast in updates, as they move through the dist update path, they get and release the UpdateLog lock many times. So updates will pile up on those locks, and they will get the lock in random order. So a bunch of updates that are trying to finish may get caught waiting for a bunch of new updates looking to start or do a middle part. Child and atomic updates, which do synchronous distribution or need to wait for a dependent update, clog things up even more. Making the locks fair at least helps updates that came in first to not get caught waiting for all these updates coming in later.
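The “fair” locks remark refers to lock fairness; a minimal sketch using java.util.concurrent, as an illustration of the idea rather than the actual UpdateLog change:
{code:java}
import java.util.concurrent.locks.ReentrantLock;

public class FairUpdateLock {
  // fair = true grants the lock to the longest-waiting thread rather than an
  // arbitrary one, so updates that arrived first are not starved by newcomers.
  private final ReentrantLock lock = new ReentrantLock(true);

  public void add(String doc) {
    lock.lock();
    try {
      // ... update bookkeeping ...
    } finally {
      lock.unlock();
    }
  }
}
{code}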
[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226449#comment-17226449 ] Mark Robert Miller edited comment on SOLR-14923 at 11/5/20, 1:45 AM: - I’ll take a look in the next few days, but FYI, my recollection is that the reopen was for all updates and is required for RTG. I’ve played around with moving it out of the sync block and I believe that broke RTG. Making those syncs “fair” helps. If you blast in updates, as they move through the dist update path, they get and release the UpdateLog lock many times. So updates will pile up on those locks, and they will get the lock in random order. So a bunch of updates that are trying to finish may get caught waiting for a bunch of new updates looking to start or do a middle part. Child and atomic updates, which do synchronous distribution or need to wait for a dependent update, clog things up even more. Making the locks fair at least helps updates that came in first to not get caught waiting for all these updates coming in later. was (Author: markrmiller): I’ll take a look in the next few days, but FYI, my recollection is that that reopen was for all updates and is required for RTG. I’ve played around with moving it out of the sync block and I believe that broke RTG. Making those syncs “fair” helps. If you blast in updates, as they move through the dist update path, they get and release Update lock many times. So updates will pile up on those locks, and they will get the lock in random order. So a bunch of updates that are trying to finish May get caught waiting for a bunch of new updates looking to start or do a middle part. Child and atomic updates, that do synchronous distribution or need to wait for a dependent update, clogs things up anymore. Making the locks fair at helps updates that came in first to not get caught waiting for all these updates coming in later.
[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226449#comment-17226449 ] Mark Robert Miller edited comment on SOLR-14923 at 11/5/20, 1:52 AM: - I’ll take a look in the next few days, but FYI, my recollection is that the reopen was for all updates and is required for RTG. I’ve played around with moving it out of the sync block and I believe that broke RTG. Making those syncs “fair” helps. If you blast in updates, as they move through the dist update path, they get and release the UpdateLog lock many times. So updates will pile up on those locks, and they will get the lock in random order. So a bunch of updates that are trying to finish may get caught waiting for a bunch of new updates looking to start or do a middle part. Child and atomic updates, which do synchronous distribution or need to wait for a dependent update, clog things up even more. Making the locks fair at least helps updates that came in first to not get caught waiting for all these updates coming in later. (Also, while I’ve made that whole path much, much better, I’ve still been toying with the idea of making RTG an optional feature - I think it’s used more rarely and it’s a shame to pay for it if you don’t use it at all - peer sync uses RTG, but you can likely keep that working in a way that you don’t pay as you do now) was (Author: markrmiller): I’ll take a look in the next few days, but FYI, my recollection is that that reopen was for all updates and is required for RTG. I’ve played around with moving it out of the sync block and I believe that broke RTG. Making those syncs “fair” helps. If you blast in updates, as they move through the dist update path, they get and release the UpdateLog lock many times. So updates will pile up on those locks, and they will get the lock in random order. So a bunch of updates that are trying to finish may get caught waiting for a bunch of new updates looking to start or do a middle part. Child and atomic updates, that do synchronous distribution or need to wait for a dependent update, clogs things up even more. Making the locks fair at helps updates that came in first to not get caught waiting for all these updates coming in later.
[jira] [Commented] (SOLR-14977) Container plugins need a way to be configured
[ https://issues.apache.org/jira/browse/SOLR-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226465#comment-17226465 ] Noble Paul commented on SOLR-14977: --- {quote}Placement plugin config (names, types and values) are decided by the plugin writer, so can't be hard coded in a Solr class. {quote} Yes, you are right. The patch takes care of all those concerns. The patch applies on branch {{jira/solr-14749-api}}. > Container plugins need a way to be configured > - > > Key: SOLR-14977 > URL: https://issues.apache.org/jira/browse/SOLR-14977 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Plugin system >Reporter: Andrzej Bialecki >Priority: Major > Attachments: SOLR-14977.patch > > > Container plugins are defined in {{/clusterprops.json:/plugin}} using a > simple {{PluginMeta}} bean. This is sufficient for implementations that don't > need any configuration except for the {{pathPrefix}}, but insufficient for > anything else that needs more configuration parameters. > An example would be a {{CollectionsRepairEventListener}} plugin proposed in > PR-1962, which needs parameters such as the list of collections, {{waitFor}}, > maximum operations allowed, etc. to function properly. > This issue proposes to extend the {{PluginMeta}} bean to allow a > {{Map}} of configuration parameters. > There is an interface that we could potentially use ({{MapInitializedPlugin}}), > but it works only with {{String}} values. This is not optimal because it > requires additional type-safety validation from the consumers. The existing > {{PluginInfo}} / {{PluginInfoInitialized}} interface is too complex for this > purpose.
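A hedged sketch of what the extended bean could look like; the {{config}} field name and its value type are assumptions for illustration, not the committed API:
{code:java}
import java.util.Map;

// Hypothetical extension of the PluginMeta bean: the existing metadata fields
// plus a free-form, typed-value configuration map for the plugin implementation.
public class PluginMeta {
  public String name;                // plugin name under /clusterprops.json:/plugin
  public String klass;               // implementation class name
  public String pathPrefix;          // the one config knob mentioned in the issue
  public Map<String, Object> config; // assumed addition: per-plugin parameters,
                                     // e.g. collections list, waitFor, max operations
}
{code}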
[jira] [Commented] (LUCENE-9588) Exceptions handling in methods of SegmentingTokenizerBase
[ https://issues.apache.org/jira/browse/LUCENE-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226471#comment-17226471 ] Nguyen Minh Gia Huy commented on LUCENE-9588: - I wonder what the appropriate usage of this class should be? Let's say I want a Tokenizer that breaks the text into sentences and sends each sentence to another tokenizer, for example WhiteSpaceTokenizer, for segmentation. To do so, I would have to make that tokenizer extend SegmentingTokenizerBase and invoke the WhiteSpaceTokenizer in the *incrementWord* method. WhiteSpaceTokenizer is a Tokenizer, so it throws IOException during analysis. How could the I/O and segmentation be separated in such cases? Is SegmentingTokenizerBase intended to limit usage to non-I/O segmentation only, e.g. [HMMChineseTokenizer|https://github.com/apache/lucene-solr/blob/master/lucene/analysis/smartcn/src/java/org/apache/lucene/analysis/cn/smart/HMMChineseTokenizer.java#L46], which splits sentences with WordSegmenter and doesn't require I/O handling? > Exceptions handling in methods of SegmentingTokenizerBase > - > > Key: LUCENE-9588 > URL: https://issues.apache.org/jira/browse/LUCENE-9588 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 8.6.3 >Reporter: Nguyen Minh Gia Huy >Priority: Minor > > The *setNextSentence* and *incrementWord* methods in > *SegmentingTokenizerBase* do not declare checked exceptions, which makes > the class troublesome to inherit from. > For example, if we override incrementWord with logic that invokes > incrementToken on another tokenizer, incrementToken raises an > IOException but incrementWord is not declared to handle it. > I think having setNextSentence and incrementWord handle the IOException would > make SegmentingTokenizerBase easier to use.
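A minimal sketch of the friction being described; the subclass and its delegate stand-in are hypothetical, but the base-class hooks really do declare no checked exceptions, so a delegate's IOException has to be wrapped:
{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.text.BreakIterator;
import java.util.Locale;
import org.apache.lucene.analysis.util.SegmentingTokenizerBase;

// Hypothetical sentence-splitting tokenizer that wants to delegate each
// sentence to another Tokenizer, as described in the comment above.
public class SentenceDelegatingTokenizer extends SegmentingTokenizerBase {

  public SentenceDelegatingTokenizer() {
    super(BreakIterator.getSentenceInstance(Locale.ROOT));
  }

  @Override
  protected void setNextSentence(int sentenceStart, int sentenceEnd) {
    // a real implementation would hand buffer[sentenceStart..sentenceEnd]
    // to the delegate tokenizer here
  }

  @Override
  protected boolean incrementWord() { // no "throws IOException" allowed here
    try {
      return delegateIncrementToken();
    } catch (IOException e) {
      throw new UncheckedIOException(e); // the wrap-and-rethrow workaround
    }
  }

  // stand-in for delegate.incrementToken(), which declares IOException
  private boolean delegateIncrementToken() throws IOException {
    return false;
  }
}
{code}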
[jira] [Resolved] (LUCENE-9319) Clean up Sandbox project by retiring/delete functionality or move it to Lucene core
[ https://issues.apache.org/jira/browse/LUCENE-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomoko Uchida resolved LUCENE-9319. --- Fix Version/s: master (9.0) Assignee: Tomoko Uchida Resolution: Fixed > Clean up Sandbox project by retiring/delete functionality or move it to > Lucene core > --- > > Key: LUCENE-9319 > URL: https://issues.apache.org/jira/browse/LUCENE-9319 > Project: Lucene - Core > Issue Type: Improvement > Components: core/other >Affects Versions: master (9.0) >Reporter: David Ryan >Assignee: Tomoko Uchida >Priority: Major > Labels: build, features > Fix For: master (9.0) > > Time Spent: 4h > Remaining Estimate: 0h > > To allow Lucene to be modularised with the Java module system, there are a few > preparatory tasks to be completed prior to this being possible. These are > detailed by Uwe on the mailing list here: > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3c0a5e01d60ff2$563f9c80$02bed580$@thetaphi.de%3e] > > The lucene-sandbox currently shares package names with lucene-core, which is > not allowed in the Java module system. There are two ways to deal with this: > either prefix all packages with "sandbox" or retire the lucene-sandbox > altogether. As per the email: > {quote}Cleanup sandbox to prefix all classes there with “sandbox” package and > where needed remove package-private access. If it’s needed for internal > access, WTF: Just move the stuff to core! We have a new version 9.0, so > either retire/delete Sandbox stuff or make it part of Lucene core. > {quote} > The suggested way forward is to move sandbox code to core.
[jira] [Created] (LUCENE-9600) Clean up package name conflicts for misc module
Tomoko Uchida created LUCENE-9600: - Summary: Clean up package name conflicts for misc module Key: LUCENE-9600 URL: https://issues.apache.org/jira/browse/LUCENE-9600 Project: Lucene - Core Issue Type: Improvement Components: modules/misc Affects Versions: master (9.0) Reporter: Tomoko Uchida Assignee: Tomoko Uchida The misc module shares the package names o.a.l.document, o.a.l.index, o.a.l.search, o.a.l.store, and o.a.l.util with lucene-core. Those should be moved under o.a.l.misc (or should some classes be moved to core?).
[jira] [Updated] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomoko Uchida updated LUCENE-9499: -- Description: We have lots of package name conflicts (shared package names) between modules in the source tree. This is not only annoying for devs/users but also bad practice since Java 9 (according to my understanding), and we already have some problems with Javadocs due to these split packages, as some of us know. Also, split packages make migrating to the Java 9 module system impossible. This is the placeholder to fix all package name conflicts in Lucene. See the dev list thread for more background. [https://lists.apache.org/thread.html/r6496963e89a5e0615e53206429b6843cc5d3e923a2045cc7b7a1eb03%40%3Cdev.lucene.apache.org%3E] Modules that need to be fixed / cleaned up: - analyzers-common (LUCENE-9317) - analyzers-icu (LUCENE-9558) - backward-codecs (LUCENE-9318) - sandbox (LUCENE-9319) - misc (LUCENE-9600) - (test-framework: this can be excluded for the moment) Also, lucene-core will be heavily affected (some classes have to be moved into {{core}}, or the visibility of some classes and methods in {{core}} has to be relaxed). Probably most work can be done in parallel, but conflicts can happen. If someone wants to help out, please open an issue before working and share your thoughts with me and others. I set "Fix version" to 9.0, meaning once we make a commit here, this will be a blocker for release 9.0.0. (I don't think the changes should be delivered across two major releases; all changes have to be out at once in a major release.) If there are any objections or concerns, please leave comments. For now I have no idea about the total volume of changes or technical obstacles that have to be handled. was: We have lots of package name conflicts (shared package names) between modules in the source tree. It is not only annoying for devs/users but also indeed bad practice since Java 9 (according to my understanding), and we already have some problems with Javadocs due to these splitted packages as some of us would know. Also split packages make migrating to the Java 9 module system impossible. This is the placeholder to fix all package name conflicts in Lucene. See the dev list thread for more background. [https://lists.apache.org/thread.html/r6496963e89a5e0615e53206429b6843cc5d3e923a2045cc7b7a1eb03%40%3Cdev.lucene.apache.org%3E] Modules that need to be fixed / cleaned up: - analyzers-common (LUCENE-9317) - analyzers-icu (LUCENE-9558) - backward-codecs (LUCENE-9318) - sandbox (LUCENE-9319) - misc - (test-framework: this can be excluded for the moment) Also lucene-core will be heavily affected (some classes have to be moved into {{core}}, or some classes' and methods' in {{core}} visibility have to be relaxed). Probably most work would be done in a parallel manner, but conflicts can happen. If someone want to help out, please open an issue before working and share your thoughts with me and others. I set "Fix version" to 9.0 - means once we make a commit on here, this will be a blocker for release 9.0.0. (I don't think the changes should be delivered across two major releases; all changes have to be out at once in a major release.) If there are any objections or concerns, please leave comments. For now I have no idea about the total volume of changes or technical obstacles that have to be handled.
> Clean up package name conflicts between modules (split packages) > > > Key: LUCENE-9499 > URL: https://issues.apache.org/jira/browse/LUCENE-9499 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Major > Fix For: master (9.0)
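For context, a hedged illustration of the split-package problem driving these cleanups; the module names below are made up, and each declaration would live in its own module-info.java:
{code:java}
// Hypothetical module descriptor for a "lucene.core" module
module lucene.core {
  exports org.apache.lucene.index;
}

// Hypothetical module descriptor for a "lucene.misc" module (separate file).
// The Java module system rejects this layout: org.apache.lucene.index would
// be a split package, readable from two different modules.
module lucene.misc {
  exports org.apache.lucene.index;
}
{code}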
[jira] [Issue Comment Deleted] (LUCENE-9588) Exceptions handling in methods of SegmentingTokenizerBase
[ https://issues.apache.org/jira/browse/LUCENE-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nguyen Minh Gia Huy updated LUCENE-9588: Comment: was deleted (was: I wonder what should be the appropriate usage of this class ? Let's say I want a Tokenizer that breaks the text into sentences and send each sentence to another tokenizer, for example WhiteSpaceTokenizer, for segmentation.To do so, I would have to make that tokenizer implement the SegmentingTokenizerBase and invoke the WhiteSpaceTokenizer in the *incrementWord* method. WhiteSpaceTokenizer implements the Tokenizer so it throws I/O exception during analysis. How the I/O and segmentation could be separated in such cases ? Is SegmentingTokenizerBase intended to limit the usage for only non-i/o segmentation e.g. [HMMChineseTokenizer|https://github.com/apache/lucene-solr/blob/master/lucene/analysis/smartcn/src/java/org/apache/lucene/analysis/cn/smart/HMMChineseTokenizer.java#L46] splits sentence by WordSegmenter, which don't require I/O handling ?)
[GitHub] [lucene-site] johtani opened a new pull request #33: Fix change log URL
johtani opened a new pull request #33: URL: https://github.com/apache/lucene-site/pull/33 Add patch number in URL
[jira] [Commented] (LUCENE-9588) Exceptions handling in methods of SegmentingTokenizerBase
[ https://issues.apache.org/jira/browse/LUCENE-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226541#comment-17226541 ] Nguyen Minh Gia Huy commented on LUCENE-9588: - Sorry, I didn't explain the example with JapaneseTokenizer very well. Let's say I want a Tokenizer that breaks the text into sentences and sends each sentence to another tokenizer, for example JapaneseTokenizer, for segmentation (so that the JapaneseTokenizer doesn't analyze the whole paragraph but instead each sentence). To do so, I would have to make that tokenizer extend SegmentingTokenizerBase and invoke the JapaneseTokenizer in the *incrementWord* method. JapaneseTokenizer is a Tokenizer, so it throws IOException during analysis. For this specific use case, the *incrementToken* of JapaneseTokenizer (and any other class implementing Tokenizer) combines I/O and segmentation, and there seems to be no way to separate them. I agree with the idea that SegmentingTokenizerBase handles the I/O itself and that the subclass deals only with word segmentation. However, the word segmentation probably shouldn't be limited to non-I/O logic only. The existing subclasses of SegmentingTokenizerBase don't have this issue because they don't do word segmentation in Tokenizer style, for example [WordSegmenter|https://github.com/apache/lucene-solr/blob/master/lucene/analysis/smartcn/src/java/org/apache/lucene/analysis/cn/smart/HMMChineseTokenizer.java#L46] in HMMChineseTokenizer. I think allowing IOException in *setNextSentence* and *incrementWord* will give users more flexibility with the word segmentation and thus improve the usability of this class.
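The proposal, as a hedged sketch rather than a committed patch; today neither hook declares the exception:
{code:java}
// Hypothetical signatures after the proposed change to SegmentingTokenizerBase.
// incrementToken() in the base class already declares IOException, so these
// hooks could simply let it propagate instead of forcing subclasses to wrap it.
protected abstract void setNextSentence(int sentenceStart, int sentenceEnd) throws IOException;

protected abstract boolean incrementWord() throws IOException;
{code}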
[jira] [Comment Edited] (LUCENE-9588) Exceptions handling in methods of SegmentingTokenizerBase
[ https://issues.apache.org/jira/browse/LUCENE-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226541#comment-17226541 ] Nguyen Minh Gia Huy edited comment on LUCENE-9588 at 11/5/20, 7:24 AM: --- Sorry, I didn't explain the example with JapaneseTokenizer very well. Let's say I want a Tokenizer that breaks the text into sentences and sends each sentence to another tokenizer, for example JapaneseTokenizer, for segmentation (so that the JapaneseTokenizer doesn't analyze the whole paragraph but instead each sentence). To do so, I would have to make that tokenizer extend SegmentingTokenizerBase and invoke the JapaneseTokenizer in the *incrementWord* method. JapaneseTokenizer is a Tokenizer, so it throws IOException during analysis. For this specific use case, the *incrementToken* of JapaneseTokenizer (and any other class implementing Tokenizer) combines I/O and segmentation, and there seems to be no way to separate them. I agree with the idea that SegmentingTokenizerBase handles the I/O itself and that the subclass deals only with word segmentation. However, the word segmentation probably shouldn't be limited to non-I/O logic only. The existing subclasses of SegmentingTokenizerBase don't have this issue because they don't do word segmentation in Tokenizer style, for example [WordSegmenter|https://github.com/apache/lucene-solr/blob/master/lucene/analysis/smartcn/src/java/org/apache/lucene/analysis/cn/smart/HMMChineseTokenizer.java#L46] in HMMChineseTokenizer. I think allowing IOException in *setNextSentence* and *incrementWord* will let users have more flexibility with the word segmentation choices and thus improve the usability of this class. was (Author: huynmg): Sorry, I didn't explain the example with JapaneseTokenizer very well. Let's say I want a Tokenizer that breaks the text into sentences and send each sentence to another tokenizer, for example JapaneseTokenizer, for segmentation ( so that the JapaneseTokenizer doesn't analyze the whole paragraph but instead each sentence) .To do so, I would have to make that tokenizer implement the SegmentingTokenizerBase and invoke the JapaneseTokenizer in the *incrementWord* method. JapaneseTokenizer implements the Tokenizer so it throws I/O exception during analysis. For this specific use case, the *incrementToken* of JapaneseTokenizer ( and any other class implements Tokenizer) combines i/o and segmentation and there seems no way to separate them. I agree with the idea that SegmentingTokenizerBase handles the I/O itself and that the subclass just deals with only word segmentation. However, the word segmentation probably shouldn't be limited to only non-i/o logic. The existing subclasses of SegmentingTokenizerBase don't have such issue because they don't do word segmentation in Tokenizer style, for example [WordSegmenter|https://github.com/apache/lucene-solr/blob/master/lucene/analysis/smartcn/src/java/org/apache/lucene/analysis/cn/smart/HMMChineseTokenizer.java#L46] in HMMChineseTokenizer. I think allowing I/O exception in *setNextSentence* and *incrementWord* ** will make users have more flexibility with the word segmentation and thus improve the usability of this class.