[jira] [Created] (LUCENE-9107) CommonTermsQuery with huge no. of terms slower with top-k scoring
Tommaso Teofili created LUCENE-9107:
---
Summary: CommonTermsQuery with huge no. of terms slower with top-k scoring
Key: LUCENE-9107
URL: https://issues.apache.org/jira/browse/LUCENE-9107
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 8.3
Components: core/search
Reporter: Tommaso Teofili

In [1] a {{CommonTermsQuery}} is used to perform a query with lots of (duplicate) terms. Using a max term frequency cutoff of 0.999 for low-frequency terms, the query, although big, finishes in around 200-300ms with Lucene 7.6.0. However, after upgrading the code to Lucene 8.x, the same query runs in 2-3s instead.

After digging into it, the speed regression seems to come from the top-k scoring that version 8 enables by default, though it is not yet clear where exactly in the code. When switching back to complete hit scoring [3], the speed returns to the initial 200-300ms on Lucene 8.3.x as well.

I am looking into why this is happening and whether it only concerns {{CommonTermsQuery}} or affects {{BooleanQuery}} as well.

[1] : https://github.com/tteofili/Anserini-embeddings/blob/nnsearch/src/main/java/io/anserini/embeddings/nn/fw/FakeWordsRunner.java
[3] : https://github.com/tteofili/anserini/blob/ann-paper-reproduce/src/main/java/io/anserini/analysis/vectors/ApproximateNearestNeighborEval.java#L174

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
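For readers unfamiliar with {{CommonTermsQuery}}, the cutoff mentioned above decides which terms count as "low frequency". The following is a toy, library-free Python sketch of that partitioning idea only; it is not Lucene's code, the names are made up, and it ignores that Lucene also accepts absolute (greater-than-1) cutoff values:

```python
# Toy sketch of a CommonTermsQuery-style max term frequency cutoff.
# A fractional cutoff of 0.999 means: a term is "high frequency" only
# if it occurs in more than 99.9% of the documents in the index.

def partition_terms(doc_freqs, max_doc, cutoff=0.999):
    """Split terms into (low_freq, high_freq) groups.

    doc_freqs: dict mapping term -> number of documents containing it
    max_doc:   total number of documents in the index
    cutoff:    fraction of max_doc above which a term is high frequency
    """
    low, high = [], []
    for term, df in doc_freqs.items():
        (high if df > cutoff * max_doc else low).append(term)
    return low, high

# With a 0.99 cutoff over 1000 docs, only near-stopwords are "high":
low, high = partition_terms({"the": 999, "lucene": 10, "f42": 1},
                            max_doc=1000, cutoff=0.99)
```

Low-frequency terms are the ones that must match, while the high-frequency group is scored more cheaply; with a cutoff as permissive as 0.999, almost every term lands in the low-frequency group.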
[jira] [Updated] (LUCENE-9107) CommonTermsQuery with huge no. of terms slower with top-k scoring
[ https://issues.apache.org/jira/browse/LUCENE-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili updated LUCENE-9107:
---
Description:
In [1] a {{CommonTermsQuery}} is used to perform a query with lots of (duplicate) terms. Using a max term frequency cutoff of 0.999 for low-frequency terms, the query, although big, finishes in around 200-300ms with Lucene 7.6.0. However, after upgrading the code to Lucene 8.x, the same query runs in 2-3s instead. After digging into it, the speed regression seems to come from the top-k scoring that version 8 enables by default, though it is not yet clear where exactly in the code. When switching back to complete hit scoring [3], the speed returns to the initial 200-300ms on Lucene 8.3.x as well. I am looking into why this is happening and whether it only concerns {{CommonTermsQuery}} or affects {{BooleanQuery}} as well.

[1] : https://github.com/tteofili/Anserini-embeddings/blob/nnsearch/src/main/java/io/anserini/embeddings/nn/fw/FakeWordsRunner.java
[2] : https://github.com/castorini/anserini/blob/master/src/main/java/io/anserini/analysis/vectors/ApproximateNearestNeighborEval.java
[3] : https://github.com/tteofili/anserini/blob/ann-paper-reproduce/src/main/java/io/anserini/analysis/vectors/ApproximateNearestNeighborEval.java#L174

was:
In [1] a {{CommonTermsQuery}} is used in order to perform a query with lots of (duplicate) terms. Using a max term frequency cutoff of 0.999 for low frequency terms, the query, although big, finishes in around 2-300ms with Lucene 7.6.0. However, when upgrading the code to Lucene 8.x, the query runs in 2-3s instead. After digging a bit into it it seems that the regression in speed comes from the fact that top-k scoring introduced by default in version 8 is causing that, not sure "where" exactly in the code though. When switching back to complete hit scoring [3], the speed goes back to the initial 2-300ms also in Lucene 8.3.x. I am looking into why this is happening and if it is only concerning {{CommonTermsQuery}} or affecting {{BooleanQuery}} as well.

[1] : https://github.com/tteofili/Anserini-embeddings/blob/nnsearch/src/main/java/io/anserini/embeddings/nn/fw/FakeWordsRunner.java
[3] : https://github.com/tteofili/anserini/blob/ann-paper-reproduce/src/main/java/io/anserini/analysis/vectors/ApproximateNearestNeighborEval.java#L174
[jira] [Updated] (LUCENE-9107) CommonTermsQuery with huge no. of terms slower with top-k scoring
[ https://issues.apache.org/jira/browse/LUCENE-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili updated LUCENE-9107:
---
Description:
In [1] a {{CommonTermsQuery}} is used to perform a query with lots of (duplicate) terms. Using a max term frequency cutoff of 0.999 for low-frequency terms, the query, although big, finishes in around 200-300ms with Lucene 7.6.0. However, after upgrading the code to Lucene 8.x, the same query runs in 2-3s instead. After digging into it, the speed regression seems to come from the top-k scoring that version 8 enables by default, though it is not yet clear where exactly in the code. When switching back to complete hit scoring [3], the speed returns to the initial 200-300ms on Lucene 8.3.x as well. I am looking into why this is happening and whether it only concerns {{CommonTermsQuery}} or affects {{BooleanQuery}} as well.

[1] : https://github.com/tteofili/Anserini-embeddings/blob/nnsearch/src/main/java/io/anserini/embeddings/nn/fw/FakeWordsRunner.java
[3] : https://github.com/tteofili/anserini/blob/ann-paper-reproduce/src/main/java/io/anserini/analysis/vectors/ApproximateNearestNeighborEval.java#L174

was:
(same description, but with a broken {BooleanQuery}} markup token)
[jira] [Updated] (LUCENE-9107) CommonTermsQuery with huge no. of terms slower with top-k scoring
[ https://issues.apache.org/jira/browse/LUCENE-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili updated LUCENE-9107:
---
Description:
In [1] a {{CommonTermsQuery}} is used to perform a query with lots of (duplicate) terms. Using a max term frequency cutoff of 0.999 for low-frequency terms, the query, although big, finishes in around 200-300ms with Lucene 7.6.0. However, after upgrading the code to Lucene 8.x, the same query runs in 2-3s instead [2]. After digging into it, the speed regression seems to come from the top-k scoring that version 8 enables by default, though it is not yet clear where exactly in the code. When switching back to complete hit scoring [3], the speed returns to the initial 200-300ms on Lucene 8.3.x as well. I am looking into why this is happening and whether it only concerns {{CommonTermsQuery}} or affects {{BooleanQuery}} as well.

[1] : https://github.com/tteofili/Anserini-embeddings/blob/nnsearch/src/main/java/io/anserini/embeddings/nn/fw/FakeWordsRunner.java
[2] : https://github.com/castorini/anserini/blob/master/src/main/java/io/anserini/analysis/vectors/ApproximateNearestNeighborEval.java
[3] : https://github.com/tteofili/anserini/blob/ann-paper-reproduce/src/main/java/io/anserini/analysis/vectors/ApproximateNearestNeighborEval.java#L174

was:
(same description, without the "[2]" citation after "2-3s instead")
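Some background on the "complete hit scoring" toggle discussed in this thread: Lucene 8 collectors track the total hit count only up to a threshold, after which documents that cannot make the top-k may be skipped, so the reported count becomes a lower bound. The sketch below is a minimal, self-contained Python illustration of that trade-off; it is emphatically not Lucene's WAND/collector implementation, just a toy model of "threshold-based top-k vs. counting everything":

```python
import heapq

def top_k(scored_docs, k, total_hits_threshold):
    """Collect the k highest-scoring docs from (doc_id, score) pairs.

    Once `total_hits_threshold` hits have been counted, documents that
    cannot beat the current k-th best score are skipped and no longer
    counted, so the returned hit count is only a lower bound. Passing
    float('inf') forces complete hit counting (every match is scored).
    """
    heap, hit_count = [], 0
    for doc, score in scored_docs:
        if hit_count >= total_hits_threshold and heap and score <= heap[0][0]:
            continue  # non-competitive: skipped, not counted
        hit_count += 1
        heapq.heappush(heap, (score, doc))
        if len(heap) > k:
            heapq.heappop(heap)  # keep only the k best
    top = [doc for _, doc in sorted(heap, reverse=True)]
    return top, hit_count

docs = [(i, float(i % 10)) for i in range(100)]
_, approx = top_k(docs, k=5, total_hits_threshold=10)           # lower-bound count
_, exact = top_k(docs, k=5, total_hits_threshold=float("inf"))  # counts all 100
```

The regression reported here suggests that for a disjunction with a huge number of terms, the bookkeeping needed to decide what is "non-competitive" can itself cost more than simply scoring everything.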
[jira] [Assigned] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno Roustant reassigned LUCENE-9102:
--
Assignee: Bruno Roustant

> Add maxQueryLength option to DirectSpellchecker
> ---
> Key: LUCENE-9102
> URL: https://issues.apache.org/jira/browse/LUCENE-9102
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spellchecker
> Reporter: Andy Webb
> Assignee: Bruno Roustant
> Priority: Minor
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Attempting to spellcheck some long query terms can trigger {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to spellcheck terms over a specified length.
>
> PR: https://github.com/apache/lucene-solr/pull/1103
> Dependent Solr issue: SOLR-14131
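The option described above amounts to a length guard applied before the expensive fuzzy-automaton lookup. The Python sketch below shows the idea only; the names, the callback-based lookup, and the "0 disables the limit" convention are assumptions for illustration, not the actual DirectSpellChecker API:

```python
def suggest(term, fuzzy_lookup, max_query_length=0):
    """Return spelling suggestions for `term`.

    When `max_query_length` is positive and the term is longer, return no
    suggestions instead of running the fuzzy lookup: very long terms are
    what can blow up Levenshtein-automaton construction with
    TooComplexToDeterminizeException, so they are skipped up front.
    """
    if 0 < max_query_length < len(term):
        return []  # term too long: skip spellchecking entirely
    return fuzzy_lookup(term)

# A stand-in for the real fuzzy-automaton lookup:
fake_lookup = lambda t: [t + "s"]
suggest("lucene", fake_lookup, max_query_length=10)   # looked up normally
suggest("a" * 500, fake_lookup, max_query_length=10)  # guarded: skipped
```

The guard is deliberately checked before any automaton work happens, which is why it prevents the exception rather than merely catching it.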
[GitHub] [lucene-solr] asfgit closed pull request #1103: LUCENE-9102: add maxQueryLength option to DirectSpellChecker
asfgit closed pull request #1103: LUCENE-9102: add maxQueryLength option to DirectSpellChecker
URL: https://github.com/apache/lucene-solr/pull/1103

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[jira] [Commented] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002195#comment-17002195 ] ASF subversion and git services commented on LUCENE-9102:
-
Commit 45dce3431688b3e3094b02f8dc824183b055c212 in lucene-solr's branch refs/heads/master from Andy Webb [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=45dce34 ]
LUCENE-9102: Add maxQueryLength option to DirectSpellchecker. Closes #1103
[jira] [Commented] (SOLR-13778) Windows JDK SSL Test Failure trend: SSLException: Software caused connection abort: recv failed
[ https://issues.apache.org/jira/browse/SOLR-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002197#comment-17002197 ] Uwe Schindler commented on SOLR-13778:
--
Hi, about the JDK error handling I opened [https://bugs.openjdk.java.net/browse/JDK-8236498] on behalf of [~dweiss]. Thanks Dawid!

> Windows JDK SSL Test Failure trend: SSLException: Software caused connection abort: recv failed
> ---
> Key: SOLR-13778
> URL: https://issues.apache.org/jira/browse/SOLR-13778
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Priority: Major
> Attachments: RecvFailedTest.java, dumps-LegacyCloud.zip, logs-2019-12-12-1.zip, recv-multiple-2019-12-18.zip
>
> Now that Uwe's Jenkins build has been correctly reporting its build results for my [automated reports|http://fucit.org/solr-jenkins-reports/failure-report.html] to pick up, I've noticed a pattern of failures that indicates a definite problem with using SSL on Windows (even with Java 11.0.4).
> The symptomatic stack traces all contain...
> {noformat}
> ...
>    [junit4]> Caused by: javax.net.ssl.SSLException: Software caused connection abort: recv failed
>    [junit4]>at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
> ...
>    [junit4]> Caused by: java.net.SocketException: Software caused connection abort: recv failed
>    [junit4]>at java.base/java.net.SocketInputStream.socketRead0(Native Method)
> ...
> {noformat}
> I suspect this may be related to [https://bugs.openjdk.java.net/browse/JDK-8209333] but I have no concrete evidence to back this up.
> I'll post some details of my analysis in comments...
[jira] [Commented] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002236#comment-17002236 ] ASF subversion and git services commented on LUCENE-9102:
-
Commit ab1dc42c63a77162a2cf4ea6985364583e07bdc5 in lucene-solr's branch refs/heads/branch_8x from Bruno Roustant [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ab1dc42 ]
LUCENE-9102: Add maxQueryLength option to DirectSpellchecker.
[jira] [Comment Edited] (SOLR-12490) Query DSL supports for further referring and exclusion in JSON facets
[ https://issues.apache.org/jira/browse/SOLR-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926538#comment-16926538 ] Mikhail Khludnev edited comment on SOLR-12490 at 12/23/19 1:32 PM:
---
-I think we'd rather continue with adding yet another small cut.-
{code:java}
{
  "query" : {...},
  "params" : {
    "childFq" : [
      { "#color" : "color:black" },
      { "#size" : "size:L" }
    ]
  },
  "facet" : {
    "sku_colors_in_prods" : {
      "type" : "terms",
      "field" : "color",
      "domain" : {
        "excludeTags" : ["top", "color"],
        "filter" : [ "{!json_param}childFq" ]
      }
    }
  }
}
{code}
-Ideas are:-
* -put json as a param value; the parser garbles it to a meaningless string, but it's still available via {{req.getJSON()}}.-
* -the filter string invokes a new query parser which converts the json param into query DSL; need to decide how to keep the {{JsonQueryConverter}} counter.-

-Shouldn't be a big deal. Right?-

was (Author: mkhludnev):
The same comment text and code, without the strikethrough.

> Query DSL supports for further referring and exclusion in JSON facets
> ---
> Key: SOLR-12490
> URL: https://issues.apache.org/jira/browse/SOLR-12490
> Project: Solr
> Issue Type: Improvement
> Components: Facet Module, faceting
> Reporter: Mikhail Khludnev
> Priority: Major
> Labels: newdev
> Attachments: SOLR-12490.patch, SOLR-12490.patch, image-2019-10-21-09-37-37-118.png
>
> It's a spin-off from the [discussion|https://issues.apache.org/jira/browse/SOLR-9685?focusedCommentId=16508720&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16508720].
>
> h2. Problem
> # after SOLR-9685 we can tag separate clauses in hairy queries like {{parent}}, {{bool}}
> # we can {{domain.excludeTags}}
> # we are looking for child faceting with exclusions, see SOLR-9510, SOLR-8998
> # but we can refer only to separate params in {{domain.filter}}; it's not possible to refer to separate clauses
>
> see the first comment
[jira] [Comment Edited] (SOLR-12490) Query DSL supports for further referring and exclusion in JSON facets
[ https://issues.apache.org/jira/browse/SOLR-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948013#comment-16948013 ] Mikhail Khludnev edited comment on SOLR-12490 at 12/23/19 1:32 PM:
---
-Handling an array as in the snippet above is too cumbersome. Here's the no-brainer approach. Attaching a rough cut.-
{code:java}
{
  "query" : {...},
  "params" : {
    "color_fq" : { "#color" : "color:black" },
    "size_fq" : { "#size" : "size:L" }
  },
  "facet" : {
    "sku_colors_in_prods" : {
      "type" : "terms",
      "field" : "color",
      "domain" : {
        "excludeTags" : ["top", "color"],
        "filter" : [ "{!json_param}color_fq", "{!json_param}size_fq" ]
      }
    }
  }
}
{code}
-Opinions?-

was (Author: mkhludnev):
The same struck-through comment text and code, with "Opinions?" not struck through.
[jira] [Comment Edited] (SOLR-12490) Query DSL supports for further referring and exclusion in JSON facets
[ https://issues.apache.org/jira/browse/SOLR-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948013#comment-16948013 ] Mikhail Khludnev edited comment on SOLR-12490 at 12/23/19 1:32 PM:
---
-Handling an array as in the snippet above is too cumbersome. Here's the no-brainer approach. Attaching a rough cut.-
{code:java}
{
  "query" : {...},
  "params" : {
    "color_fq" : { "#color" : "color:black" },
    "size_fq" : { "#size" : "size:L" }
  },
  "facet" : {
    "sku_colors_in_prods" : {
      "type" : "terms",
      "field" : "color",
      "domain" : {
        "excludeTags" : ["top", "color"],
        "filter" : [ "{!json_param}color_fq", "{!json_param}size_fq" ]
      }
    }
  }
}
{code}
Opinions?

was (Author: mkhludnev):
The same comment text and code, without the strikethrough.
[jira] [Comment Edited] (SOLR-12490) Query DSL supports for further referring and exclusion in JSON facets
[ https://issues.apache.org/jira/browse/SOLR-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955801#comment-16955801 ] Mikhail Khludnev edited comment on SOLR-12490 at 12/23/19 1:33 PM:
---
-Renamed it \{!json} to avoid a clash with nestedQP. Here's how it looks now:-
!image-2019-10-21-09-37-37-118.png!

was (Author: mkhludnev):
Renamed it \{!json} to avoid a clash with nestedQP. Here's how it looks now:
!image-2019-10-21-09-37-37-118.png!
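Based on the comments above, a facet request after the rename would presumably look like the earlier snippets with {!json} in place of {!json_param}. This is a sketch inferred from the discussion, not verified against the final patch:

{code:java}
{
  "query" : "*:*",
  "params" : {
    "color_fq" : { "#color" : "color:black" },
    "size_fq" : { "#size" : "size:L" }
  },
  "facet" : {
    "sku_colors_in_prods" : {
      "type" : "terms",
      "field" : "color",
      "domain" : {
        "excludeTags" : ["top", "color"],
        "filter" : [ "{!json}color_fq", "{!json}size_fq" ]
      }
    }
  }
}
{code}

The tagged clauses ({{#color}}, {{#size}}) stay referenceable and excludable from the facet domain, which is the whole point of the issue.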
[GitHub] [lucene-solr] andywebb1975 opened a new pull request #1112: SOLR-14131: add maxQueryLength option
andywebb1975 opened a new pull request #1112: SOLR-14131: add maxQueryLength option
URL: https://github.com/apache/lucene-solr/pull/1112

This is a work-in-progress - I'm trying to get tests working.

# Description
Please provide a short description of the changes you're making with this pull request.

# Solution
Please provide a short description of the approach taken to implement your solution.

# Tests
Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.

# Checklist
Please review the following and check all that apply:
- [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [ ] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [ ] I have developed this patch against the `master` branch.
- [ ] I have run `ant precommit` and the appropriate test suite.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[GitHub] [lucene-solr] andywebb1975 closed pull request #1107: SOLR-14131: add maxQueryLength option to DirectSolrSpellChecker
andywebb1975 closed pull request #1107: SOLR-14131: add maxQueryLength option to DirectSolrSpellChecker URL: https://github.com/apache/lucene-solr/pull/1107
[jira] [Updated] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14131: - Description: Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. Here's a draft PR: https://github.com/apache/lucene-solr/pull/1112 (I'm struggling writing tests, and we should update the Solr docs too.) was:Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellchecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a draft PR: https://github.com/apache/lucene-solr/pull/1112 (I'm > struggling writing tests, and we should update the Solr docs too.) 
[GitHub] [lucene-solr] andywebb1975 closed pull request #1112: SOLR-14131: add maxQueryLength option
andywebb1975 closed pull request #1112: SOLR-14131: add maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1112
[jira] [Created] (LUCENE-9108) eliminate JKS keystore from solr SSL docs
Robert Muir created LUCENE-9108: --- Summary: eliminate JKS keystore from solr SSL docs Key: LUCENE-9108 URL: https://issues.apache.org/jira/browse/LUCENE-9108 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir On the "Enabling SSL" page: https://lucene.apache.org/solr/guide/8_3/enabling-ssl.html#enabling-ssl The first step is currently to create a JKS keystore. The next step immediately converts the JKS keystore into PKCS12, so that openssl can then be used to extract key material in PEM format for use with curl. Now that PKCS12 is java's default keystore format, why not omit step 1 entirely? What am I missing? PKCS12 is a more commonly understood/standardized format.
[jira] [Moved] (SOLR-14141) eliminate JKS keystore from solr SSL docs
[ https://issues.apache.org/jira/browse/SOLR-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir moved LUCENE-9108 to SOLR-14141: Key: SOLR-14141 (was: LUCENE-9108) Lucene Fields: (was: New) Project: Solr (was: Lucene - Core) Security: Public > eliminate JKS keystore from solr SSL docs > - > > Key: SOLR-14141 > URL: https://issues.apache.org/jira/browse/SOLR-14141 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Robert Muir >Priority: Major > > On the "Enabling SSL" page: > https://lucene.apache.org/solr/guide/8_3/enabling-ssl.html#enabling-ssl > The first step is currently to create a JKS keystore. The next step > immediately converts the JKS keystore into PKCS12, so that openssl can then > be used to extract key material in PEM format for use with curl. > Now that PKCS12 is java's default keystore format, why not omit step 1 > entirely? What am I missing? PKCS12 is a more commonly > understood/standardized format.
[GitHub] [lucene-solr] rmuir merged pull request #1110: SOLR-14138: enable request log via environ var
rmuir merged pull request #1110: SOLR-14138: enable request log via environ var URL: https://github.com/apache/lucene-solr/pull/1110
[jira] [Commented] (SOLR-14138) Fix commented-out RequestLog in jetty.xml to use non-deprecated class
[ https://issues.apache.org/jira/browse/SOLR-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002348#comment-17002348 ] ASF subversion and git services commented on SOLR-14138: Commit 1425d6cbf853a8ab8998f95b6982c065d9bac1c7 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1425d6c ] SOLR-14138: enable request log via environ var, remove deprecated jetty class usage, respect SOLR_LOGS_DIR (#1110) User can now set SOLR_REQUESTLOG_ENABLED=true to enable the jetty request log, instead of editing XML. The location of the request logs will respect SOLR_LOGS_DIR if that is set. The deprecated NCSARequestLog is no longer used, instead it uses CustomRequestLog with NCSA_FORMAT. > Fix commented-out RequestLog in jetty.xml to use non-deprecated class > - > > Key: SOLR-14138 > URL: https://issues.apache.org/jira/browse/SOLR-14138 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Robert Muir >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently the jetty request logging is disabled (commented out). > But it can be useful, e.g. since it uses a standard logging format and there > are tools to analyze it by default. Also it can be used to detect some > attacks not otherwise logged anywhere else, since they don't make it to solr > servlet: requests blocked at the jetty level (invalid/malformed requests, > ones filtered by jetty IP filtering, etc). > We should switch it from the deprecated NCSARequestLog class, instead to use > the CustomRequestLog with either NCSA_FORMAT or EXTENDED_NCSA_FORMAT. > {quote} > Deprecated. 
> use CustomRequestLog given format string > CustomRequestLog.EXTENDED_NCSA_FORMAT with a RequestLogWriter > {quote}
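The non-deprecated setup the issue describes could be sketched in jetty.xml roughly as below. This is illustrative only, not the actual committed configuration: the class and constant names (`CustomRequestLog`, `RequestLogWriter`, `NCSA_FORMAT`) are Jetty 9.4's, but the property name and setter layout here are assumptions.

```xml
<!-- Illustrative jetty.xml fragment: CustomRequestLog writing NCSA-format
     lines via a RequestLogWriter, replacing the deprecated NCSARequestLog. -->
<Set name="RequestLog">
  <New class="org.eclipse.jetty.server.CustomRequestLog">
    <Arg>
      <New class="org.eclipse.jetty.server.RequestLogWriter">
        <Set name="filename"><Property name="solr.log.dir" default="logs"/>/yyyy_mm_dd.request.log</Set>
        <Set name="append">true</Set>
        <Set name="retainDays">3</Set>
      </New>
    </Arg>
    <Arg><Get class="org.eclipse.jetty.server.CustomRequestLog" name="NCSA_FORMAT"/></Arg>
  </New>
</Set>
```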
[jira] [Commented] (SOLR-14138) Fix commented-out RequestLog in jetty.xml to use non-deprecated class
[ https://issues.apache.org/jira/browse/SOLR-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002350#comment-17002350 ] ASF subversion and git services commented on SOLR-14138: Commit baeaa56fb27efe41b9c41d35a93d086b2a9d7cb4 in lucene-solr's branch refs/heads/branch_8x from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=baeaa56 ] SOLR-14138: enable request log via environ var, remove deprecated jetty class usage, respect SOLR_LOGS_DIR (#1110) User can now set SOLR_REQUESTLOG_ENABLED=true to enable the jetty request log, instead of editing XML. The location of the request logs will respect SOLR_LOGS_DIR if that is set. The deprecated NCSARequestLog is no longer used, instead it uses CustomRequestLog with NCSA_FORMAT. > Fix commented-out RequestLog in jetty.xml to use non-deprecated class > - > > Key: SOLR-14138 > URL: https://issues.apache.org/jira/browse/SOLR-14138 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Robert Muir >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently the jetty request logging is disabled (commented out). > But it can be useful, e.g. since it uses a standard logging format and there > are tools to analyze it by default. Also it can be used to detect some > attacks not otherwise logged anywhere else, since they don't make it to solr > servlet: requests blocked at the jetty level (invalid/malformed requests, > ones filtered by jetty IP filtering, etc). > We should switch it from the deprecated NCSARequestLog class, instead to use > the CustomRequestLog with either NCSA_FORMAT or EXTENDED_NCSA_FORMAT. > {quote} > Deprecated. 
> use CustomRequestLog given format string > CustomRequestLog.EXTENDED_NCSA_FORMAT with a RequestLogWriter > {quote}
[jira] [Resolved] (SOLR-14138) Fix commented-out RequestLog in jetty.xml to use non-deprecated class
[ https://issues.apache.org/jira/browse/SOLR-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved SOLR-14138. Assignee: Robert Muir Resolution: Fixed > Fix commented-out RequestLog in jetty.xml to use non-deprecated class > - > > Key: SOLR-14138 > URL: https://issues.apache.org/jira/browse/SOLR-14138 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Fix For: 8.5 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently the jetty request logging is disabled (commented out). > But it can be useful, e.g. since it uses a standard logging format and there > are tools to analyze it by default. Also it can be used to detect some > attacks not otherwise logged anywhere else, since they don't make it to solr > servlet: requests blocked at the jetty level (invalid/malformed requests, > ones filtered by jetty IP filtering, etc). > We should switch it from the deprecated NCSARequestLog class, instead to use > the CustomRequestLog with either NCSA_FORMAT or EXTENDED_NCSA_FORMAT. > {quote} > Deprecated. > use CustomRequestLog given format string > CustomRequestLog.EXTENDED_NCSA_FORMAT with a RequestLogWriter > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14138) Fix commented-out RequestLog in jetty.xml to use non-deprecated class
[ https://issues.apache.org/jira/browse/SOLR-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-14138: --- Fix Version/s: 8.5 > Fix commented-out RequestLog in jetty.xml to use non-deprecated class > - > > Key: SOLR-14138 > URL: https://issues.apache.org/jira/browse/SOLR-14138 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Robert Muir >Priority: Major > Fix For: 8.5 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently the jetty request logging is disabled (commented out). > But it can be useful, e.g. since it uses a standard logging format and there > are tools to analyze it by default. Also it can be used to detect some > attacks not otherwise logged anywhere else, since they don't make it to solr > servlet: requests blocked at the jetty level (invalid/malformed requests, > ones filtered by jetty IP filtering, etc). > We should switch it from the deprecated NCSARequestLog class, instead to use > the CustomRequestLog with either NCSA_FORMAT or EXTENDED_NCSA_FORMAT. > {quote} > Deprecated. > use CustomRequestLog given format string > CustomRequestLog.EXTENDED_NCSA_FORMAT with a RequestLogWriter > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14142) Enable jetty's request log by default
Robert Muir created SOLR-14142: -- Summary: Enable jetty's request log by default Key: SOLR-14142 URL: https://issues.apache.org/jira/browse/SOLR-14142 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Robert Muir I'd like to enable the jetty request log by default. This log is now in the correct directory, it no longer uses the deprecated mechanisms (it is asynclogwriter + customformat), etc. See SOLR-14138. This log is in a standard format (NCSA) which is supported by tools out-of-box. It does not contain challenges such as java exceptions and is easy to work with. Without it enabled, solr really has insufficient logging (e.g. no IP addresses). If someone's solr gets hacked, its only fair they at least get to see who did it.
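With the SOLR-14138 change merged, turning the request log on is an environment setting rather than an XML edit. A sketch of the relevant lines in solr.in.sh — the variable names are taken from the commit message, but the directory value is a placeholder:

```bash
# Enable Jetty's NCSA-format request log without editing jetty.xml.
SOLR_REQUESTLOG_ENABLED=true
# When set, request logs are written alongside the other Solr logs.
SOLR_LOGS_DIR=/var/solr/logs
```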
[GitHub] [lucene-solr] andywebb1975 closed pull request #1098: SOLR-13190 - added maxQueryLength parameter
andywebb1975 closed pull request #1098: SOLR-13190 - added maxQueryLength parameter URL: https://github.com/apache/lucene-solr/pull/1098
[GitHub] [lucene-solr] andywebb1975 opened a new pull request #1113: SOLR-14131: adds maxQueryLength option
andywebb1975 opened a new pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113 # Description Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. # Solution This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. # Tests A new test checks that a term is spellchecked before maxQueryLength is reduced, and not afterwards. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [ ] I have run `ant precommit` and the appropriate test suite. - [x] I have added tests for my changes. - [x] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
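A spellchecker configured this way might look like the solrconfig.xml fragment below. This is an illustrative sketch: it assumes the new option is exposed under the key `maxQueryLength` alongside the existing `minQueryLength`, and the field name and threshold values are placeholders.

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">teststop</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <!-- existing option: terms shorter than this are not spellchecked -->
    <int name="minQueryLength">2</int>
    <!-- option added by this PR: terms longer than this are not spellchecked,
         avoiding TooComplexToDeterminizeException on very long terms -->
    <int name="maxQueryLength">40</int>
  </lst>
</searchComponent>
```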
[jira] [Commented] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002356#comment-17002356 ] ASF subversion and git services commented on LUCENE-9102: - Commit 663bfe2d8b5b4996806d4fcf4cc09ea12be45464 in lucene-solr's branch refs/heads/master from Bruno Roustant [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=663bfe2 ] LUCENE-9102: update changes.txt > Add maxQueryLength option to DirectSpellchecker > --- > > Key: LUCENE-9102 > URL: https://issues.apache.org/jira/browse/LUCENE-9102 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spellchecker >Reporter: Andy Webb >Assignee: Bruno Roustant >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This > change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option > to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to > spellcheck terms over a specified length. > PR: https://github.com/apache/lucene-solr/pull/1103 > Dependent Solr issue: SOLR-14131
[jira] [Updated] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14131: - Description: Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. Here's a PR: https://github.com/apache/lucene-solr/pull/1113 was: Attempting to spellcheck some long query terms can trigger org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr can be configured to not attempt to spellcheck terms over a specified length. Here's a draft PR: https://github.com/apache/lucene-solr/pull/1112 (I'm struggling writing tests, and we should update the Solr docs too.) > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. 
> Here's a PR: https://github.com/apache/lucene-solr/pull/1113
[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360930320 ## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ## @@ -79,7 +79,7 @@ public void test() throws Exception { return null; }); } - + Review comment: It's not clear to me what the "super" test above does. As far as I can see, the test runs a spellcheck for "super" but then uses "fob" as the index into suggestions, which will never find an entry.
[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360930320 ## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ## @@ -79,7 +79,7 @@ public void test() throws Exception { return null; }); } - + Review comment: It's not clear to me what the "super" test above is for. As far as I can see, the test runs a spellcheck for "super" but then uses "fob" as the index into suggestions, which will never find an entry.
[jira] [Commented] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002359#comment-17002359 ] ASF subversion and git services commented on LUCENE-9102: - Commit 361bf78d899433730210bcec1b775b74cbb71664 in lucene-solr's branch refs/heads/branch_8x from Bruno Roustant [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=361bf78 ] LUCENE-9102: update changes.txt > Add maxQueryLength option to DirectSpellchecker > --- > > Key: LUCENE-9102 > URL: https://issues.apache.org/jira/browse/LUCENE-9102 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spellchecker >Reporter: Andy Webb >Assignee: Bruno Roustant >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This > change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option > to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to > spellcheck terms over a specified length. > PR: https://github.com/apache/lucene-solr/pull/1103 > Dependent Solr issue: SOLR-14131
[jira] [Updated] (LUCENE-9102) Add maxQueryLength option to DirectSpellchecker
[ https://issues.apache.org/jira/browse/LUCENE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruno Roustant updated LUCENE-9102: --- Fix Version/s: 8.5 Resolution: Fixed Status: Resolved (was: Patch Available) > Add maxQueryLength option to DirectSpellchecker > --- > > Key: LUCENE-9102 > URL: https://issues.apache.org/jira/browse/LUCENE-9102 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spellchecker >Reporter: Andy Webb >Assignee: Bruno Roustant >Priority: Minor > Fix For: 8.5 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > {{org.apache.lucene.util.automaton.TooComplexToDeterminizeException}}. This > change (previously discussed in SOLR-13190) adds a {{maxQueryLength}} option > to {{DirectSpellchecker}} so that Lucene can be configured to not attempt to > spellcheck terms over a specified length. > PR: https://github.com/apache/lucene-solr/pull/1103 > Dependent Solr issue: SOLR-14131
[jira] [Updated] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-14131: - Status: Patch Available (was: Open) > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a PR: https://github.com/apache/lucene-solr/pull/1113
[jira] [Commented] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002366#comment-17002366 ] Bruno Roustant commented on SOLR-14131: --- [~andywebb1975] you closed the PR#1112. Will you link another PR? > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a PR: https://github.com/apache/lucene-solr/pull/1113
[jira] [Commented] (SOLR-14131) Add maxQueryLength option to DirectSolrSpellchecker
[ https://issues.apache.org/jira/browse/SOLR-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002368#comment-17002368 ] Andy Webb commented on SOLR-14131: -- hi Bruno - thanks for committing the Lucene change! I've linked [PR 1113|https://github.com/apache/lucene-solr/pull/1113] as the earlier one got messy - but with a colleague's help I think I've got a decent test for the change - let me know what you think. Andy > Add maxQueryLength option to DirectSolrSpellchecker > --- > > Key: SOLR-14131 > URL: https://issues.apache.org/jira/browse/SOLR-14131 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: search >Reporter: Andy Webb >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Attempting to spellcheck some long query terms can trigger > org.apache.lucene.util.automaton.TooComplexToDeterminizeException. This > change (previously discussed in SOLR-13190, and dependent on LUCENE-9102) > adds a maxQueryLength option to DirectSolrSpellChecker so that Lucene/Solr > can be configured to not attempt to spellcheck terms over a specified length. > Here's a PR: https://github.com/apache/lucene-solr/pull/1113
[GitHub] [lucene-solr] madrob commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
madrob commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360934384 ## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ## @@ -88,6 +88,45 @@ public void testOnlyMorePopularWithExtendedResults() throws Exception { "//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='fox']/arr[@name='suggestion']/lst/int[@name='freq']=2", "//lst[@name='spellcheck']/bool[@name='correctlySpelled']='true'" ); - } + } + + @Test + public void testMaxQueryLength() throws Exception { +testMaxQueryLength(true); +testMaxQueryLength(false); + } + + private void testMaxQueryLength(Boolean limitQueryLength) throws Exception { + +DirectSolrSpellChecker checker = new DirectSolrSpellChecker(); +NamedList spellchecker = new NamedList(); +spellchecker.add("classname", DirectSolrSpellChecker.class.getName()); +spellchecker.add(SolrSpellChecker.FIELD, "teststop"); +spellchecker.add(DirectSolrSpellChecker.MINQUERYLENGTH, 2); + +// demonstrate that "anothar" is not corrected when maxQueryLength is set to a small number +if (limitQueryLength) spellchecker.add(DirectSolrSpellChecker.MAXQUERYLENGTH, 4); + +SolrCore core = h.getCore(); +checker.init(spellchecker, core); + +h.getCore().withSearcher(searcher -> { + Collection tokens = queryConverter.convert("anothar"); + SpellingOptions spellOpts = new SpellingOptions(tokens, searcher.getIndexReader()); + SpellingResult result = checker.getSuggestions(spellOpts); + assertTrue("result should not be null", result != null); Review comment: minor nit: we can get cleaner test failures by using `assertNotNull` here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [lucene-solr] madrob commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
madrob commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360938182

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
@@ -79,7 +79,7 @@ public void test() throws Exception {

Review comment: Yes, and it asserts that there are no results. It's the negative test case for the spell checking match.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
madrob commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360934621

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
+        assertTrue(entry.getKey() + " is not equal to 'another'", entry.getKey().equals("another") == true);

Review comment: minor nit: use `assertEquals` here
[jira] [Updated] (SOLR-14142) Enable jetty's request log by default
[ https://issues.apache.org/jira/browse/SOLR-14142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-14142: --- Fix Version/s: master (9.0) > Enable jetty's request log by default > - > > Key: SOLR-14142 > URL: https://issues.apache.org/jira/browse/SOLR-14142 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Robert Muir >Priority: Major > Fix For: master (9.0) > > > I'd like to enable the jetty request log by default. > This log is now in the correct directory, and it no longer uses the deprecated > mechanisms (it is asynclogwriter + customformat); see SOLR-14138. > This log is in a standard format (NCSA) which is supported by tools > out of the box. It does not contain challenges such as java exceptions and is > easy to work with. Without it enabled, solr really has insufficient logging > (e.g. no IP addresses). > If someone's solr gets hacked, it's only fair they at least get to see who did > it.
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360947790

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
@@ -79,7 +79,7 @@ public void test() throws Exception {

Review comment: Good point. Indeed the test at line 77 should be fixed to use spellOpts.tokens instead and to expect empty suggestions. Would you like to fix it?
[jira] [Updated] (SOLR-14142) Enable jetty's request log by default
[ https://issues.apache.org/jira/browse/SOLR-14142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-14142: --- Attachment: SOLR-14142.patch > Enable jetty's request log by default > - > > Attachments: SOLR-14142.patch
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360949962

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
+        Map.Entry entry = suggestions.entrySet().iterator().next();

Review comment: Maybe we could insert an assertFalse(suggestions.isEmpty()), otherwise the line below will throw a NoSuchElementException, which is less nice in a test.
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360949352

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
+      Map suggestions = result.get(tokens.iterator().next());
+      assertTrue("suggestions should not be null", suggestions != null);

Review comment: assertNotNull?
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360949293

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
+      SpellingResult result = checker.getSuggestions(spellOpts);
+      assertTrue("result should not be null", result != null);

Review comment: assertNotNull?
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360949433

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
+        assertTrue(entry.getKey() + " is not equal to 'another'", entry.getKey().equals("another") == true);

Review comment: assertEquals
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360949198

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
+    NamedList spellchecker = new NamedList();

Review comment: We should use generics here: NamedList spellchecker = new NamedList<>();
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360950592

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##

Review comment: (comment race)
[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
bruno-roustant commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360950705

## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ##
+        assertTrue(entry.getKey() + " is not equal to 'another'", entry.getKey().equals("another") == true);

Review comment: (comment race)
[jira] [Commented] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002388#comment-17002388 ] Jason Gerlowski commented on SOLR-13890: bq. I highly doubt the PostFilter abstraction somehow offers a perf benefit in your benchmark that cannot be achieved with TwoPhaseIterator I'm leaning on your correction a bit here as you're more familiar with the Lucene code than I am. But as I read the TPI implementation for DocValuesTermsQuery, I see one reason why a postfilter impl might be faster (other than segment-level vs top-level). The TPI "approximation" for DocValuesTermsQuery is the unfiltered doc-values structure for the field. As a result, TPI {{matches()}} is going to be called on all documents that have any value at all for the field in question. Under a post-filter implementation, the bitset lookup is (potentially) called much less frequently, as we only look up values for docs that have matched all the other (non-postfilter) query clauses. Does that make sense, or am I off-base [~dsmiley]? In either case, this is hypothetical. The real proof is in a perf experiment. I'm putting one together now to share soon. bq. Though I don't know whether the details of my test would have tripped whatever heuristics Lucene uses to turn TPI on/off. As best as I can tell from the [code|https://github.com/apache/lucene-solr/blob/174cc63bad411eace196a6c7028bdd24864fefed/lucene/sandbox/src/java/org/apache/lucene/search/DocValuesTermsQuery.java#L218], it looks like DVTQ always uses TPI processing. So there's no particular concern about ensuring that logic is triggered when I perf test. > Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch > > > There are some use-cases where it'd be nice if the "terms" qparser created a > query that could be run as a postfilter. Particularly, when users are > checking for hundreds or thousands of terms, a postfilter implementation can > be more performant than the standard processing. > With this issue, I'd like to propose a post-filter implementation for the > {{docValuesTermsFilter}} "method". Postfilter creation can use a > SortedSetDocValues object to populate a DV bitset with the "terms" being > checked for. Each document run through the post-filter can look at its > doc-values for the field in question and check them efficiently against the > constructed bitset.
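The cost difference Jason hypothesizes can be illustrated with a toy model (this is NOT Lucene code; the class and method names below are invented for illustration): a two-phase iterator whose approximation is "every doc with a value for the field" consults matches() once per such doc, while a post-filter only performs its bitset lookup for docs that already passed the other query clauses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy cost model for the TPI-vs-postfilter comparison discussed above.
// NOT Lucene code; names are invented for illustration only.
public class PostFilterCostSketch {

    // TPI-style: the approximation is the unfiltered doc-values structure,
    // so matches() is consulted for every doc that has any value for the field.
    static int tpiLookups(int docsWithFieldValue) {
        return docsWithFieldValue;
    }

    // Post-filter style: only docs that matched all other query clauses
    // reach the filter, so the bitset lookup runs once per such doc.
    static int postFilterLookups(Set<Integer> docsMatchingOtherClauses) {
        return docsMatchingOtherClauses.size();
    }

    public static void main(String[] args) {
        int docsWithFieldValue = 1_000_000;           // many docs carry the field
        Set<Integer> otherClauseMatches = new HashSet<>(Arrays.asList(3, 42, 7_654));
        System.out.println("TPI matches() calls: " + tpiLookups(docsWithFieldValue));
        System.out.println("post-filter lookups: " + postFilterLookups(otherClauseMatches));
    }
}
```

When the other clauses are selective, the post-filter performs orders of magnitude fewer lookups in this model; as the comment notes, the real answer still needs a perf experiment.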
[jira] [Commented] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002416#comment-17002416 ] Joel Bernstein commented on SOLR-13890: --- I dug into this pretty deeply and I believe there is a large advantage to the top level doc values approach when there is a large number of terms. The reason is that *MultiSortedSetDocValues.lookupOrd* (in MultiDocValues) is really clever, so the overhead of doing the top level term lookup is much less than doing the segment by segment term lookups. Using the top level ordinals inside of the scorer would be possible also, but seemed kind of awkward. But, in theory, using top level ordinals in the scorer would get similar performance to this patch.
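The top-level-ordinal idea Joel describes can be sketched in miniature (a toy, not Lucene's MultiDocValues/OrdinalMap implementation; all names below are invented): per-segment term dictionaries are unified into one global sorted dictionary, the query's terms are resolved against it once up front, and each per-document membership check then becomes a cheap bitset test on global ordinals instead of a term lookup.

```java
import java.util.*;

// Toy sketch of "top-level ordinals"; NOT Lucene's MultiDocValues. Names invented.
public class GlobalOrdinalsSketch {

    // Merge per-segment sorted term dictionaries into one global sorted dictionary.
    static List<String> globalDictionary(List<List<String>> segmentTerms) {
        TreeSet<String> merged = new TreeSet<>();
        for (List<String> seg : segmentTerms) merged.addAll(seg);
        return new ArrayList<>(merged);
    }

    // Resolve the query's terms to global ordinals once, up front.
    static BitSet queryOrdinals(List<String> globalDict, Set<String> queryTerms) {
        BitSet ords = new BitSet(globalDict.size());
        for (int ord = 0; ord < globalDict.size(); ord++) {
            if (queryTerms.contains(globalDict.get(ord))) ords.set(ord);
        }
        return ords;
    }

    // Per-document check: one bitset test per ordinal instead of a term lookup.
    static boolean matches(BitSet queryOrds, int[] docGlobalOrds) {
        for (int ord : docGlobalOrds) {
            if (queryOrds.get(ord)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> dict = globalDictionary(Arrays.asList(
                Arrays.asList("apple", "fox"), Arrays.asList("another", "fox")));
        BitSet ords = queryOrdinals(dict, new HashSet<>(Arrays.asList("fox")));
        System.out.println(dict);                        // the merged, sorted dictionary
        System.out.println(matches(ords, new int[]{2})); // a doc holding the ord of "fox"
    }
}
```

The expensive step (term-to-ordinal resolution) happens once per query rather than once per segment per term, which is where the advantage grows with the number of terms.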
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002416#comment-17002416 ] Joel Bernstein edited comment on SOLR-13890 at 12/23/19 6:09 PM: minor wording fixes to the comment above.
[jira] [Commented] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002420#comment-17002420 ] Joel Bernstein commented on SOLR-13890: --- We have code somewhat similar to this patch deployed with a cross-core join that provides sub-second performance with 50,000 join terms. We will not achieve that with the terms query because 50,000 terms is too large to pass in efficiently, but the term lookups are scalable with the top level ordinal approach.
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002416#comment-17002416 ] Joel Bernstein edited comment on SOLR-13890 at 12/23/19 6:15 PM: - I dug into this pretty deeply and I believe there is a large advantage to the top-level doc values approach when there is a large number of terms. The reason is that *MultiSortedSetDocValues.lookupOrd* (in MultiDocValues) is really clever, so the overhead of doing the top-level term lookup is much less than doing the segment-by-segment term lookups. Using the top-level ordinals inside the scorer would also be possible, but seemed kind of awkward. In theory, though, using top-level ordinals in the scorer would get similar performance to this patch. > Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public (Default Security Level. Issues are Public) > Components: query parsers > Affects Versions: master (9.0) > Reporter: Jason Gerlowski > Assignee: Jason Gerlowski > Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch > > > There are some use-cases where it'd be nice if the "terms" qparser created a query that could be run as a postfilter. Particularly, when users are checking for hundreds or thousands of terms, a postfilter implementation can be more performant than the standard processing. > With this issue, I'd like to propose a post-filter implementation for the {{docValuesTermsFilter}} "method". Postfilter creation can use a SortedSetDocValues object to populate a DV bitset with the "terms" being checked for. Each document run through the post-filter can look at its doc-values for the field in question and check them efficiently against the constructed bitset. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
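Joel's point about *MultiSortedSetDocValues.lookupOrd* can be illustrated with a Lucene-free sketch: if a top-level (global) ordinal can be routed to one owning segment with a single binary search, only that segment's term dictionary is consulted, instead of re-resolving the term in every segment. This is a deliberate simplification (the real OrdinalMap behind MultiSortedSetDocValues is cleverer, since segments share terms); all names and numbers here are illustrative:

```java
import java.util.Arrays;

public class GlobalOrdSketch {
    // Start of each segment's ordinal range in the global (top-level) ord space,
    // e.g. segment 0 owns global ords [0,100), segment 1 owns [100,250), ...
    static final long[] ORD_BASES = {0, 100, 250, 400};

    // Map a global ord to {segment, ord-within-segment} with one binary search,
    // rather than performing a term lookup in every segment.
    static long[] segmentAndOrd(long globalOrd) {
        int idx = Arrays.binarySearch(ORD_BASES, globalOrd);
        int segment = idx >= 0 ? idx : -idx - 2; // insertion point minus one
        return new long[] {segment, globalOrd - ORD_BASES[segment]};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(segmentAndOrd(150))); // [1, 50]
        System.out.println(Arrays.toString(segmentAndOrd(400))); // [3, 0]
    }
}
```

The sketch only shows why one top-level lookup is cheaper than N per-segment lookups; the real structure additionally deduplicates terms that appear in multiple segments.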
[jira] [Commented] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002424#comment-17002424 ] Joel Bernstein commented on SOLR-13890: --- The other really big aspect of this is caching. Even though the scorer-based filter can be fast if it's applied with the main query, in Solr that's not going to happen. The reason is the filter cache: Solr will apply the filter against the entire index and create a DocSet to cache. Our filter cache is top-level, so it gets dumped after a single document is loaded. So in scenarios where there is lots of indexing going on, the filter cache becomes problematic. There are ways around this issue, like turning off caching using local params, or not using filter queries, but these approaches are not what users typically do with a filter. So the postfilter behavior (not cached in the filter cache) provides the best solution for certain situations where the filter cache is problematic.
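The docValuesTermsFilter approach proposed in this issue boils down to two steps: resolve each query term to its ordinal once and set those ordinals in a bitset, then test each candidate document's ordinals against that bitset. A Lucene-free sketch of the idea, with a plain sorted list standing in for the field's SortedSetDocValues term dictionary (all names and data are illustrative):

```java
import java.util.BitSet;
import java.util.List;

public class DvPostFilterSketch {
    // Sorted term dictionary standing in for the field's SortedSetDocValues.
    static final List<String> TERM_DICT = List.of("apple", "banana", "cherry", "date");

    // One-time setup: build a bitset over term ordinals for the query terms.
    static BitSet buildOrdBitSet(List<String> queryTerms) {
        BitSet ords = new BitSet(TERM_DICT.size());
        for (String t : queryTerms) {
            int ord = TERM_DICT.indexOf(t); // stands in for lookupTerm() in the DV API
            if (ord >= 0) ords.set(ord);
        }
        return ords;
    }

    // Per-document check: does any of the doc's ordinals hit the query bitset?
    static boolean matches(int[] docOrds, BitSet queryOrds) {
        for (int ord : docOrds) {
            if (queryOrds.get(ord)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BitSet q = buildOrdBitSet(List.of("banana", "date"));
        System.out.println(matches(new int[]{0, 1}, q)); // doc has "apple","banana" -> true
        System.out.println(matches(new int[]{2}, q));    // doc has only "cherry"   -> false
    }
}
```

The point of the design is that string comparisons happen only during setup; per-document work is ordinal-sized bitset reads, which stays cheap even with thousands of query terms.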
[jira] [Commented] (SOLR-14138) Fix commented-out RequestLog in jetty.xml to use non-deprecated class
[ https://issues.apache.org/jira/browse/SOLR-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002438#comment-17002438 ] ASF subversion and git services commented on SOLR-14138: Commit 403fd05646c32981ca15637678602eb12c5239d7 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=403fd05 ] SOLR-14138: changes.txt > Fix commented-out RequestLog in jetty.xml to use non-deprecated class > - > > Key: SOLR-14138 > URL: https://issues.apache.org/jira/browse/SOLR-14138 > Project: Solr > Issue Type: Improvement > Security Level: Public (Default Security Level. Issues are Public) > Reporter: Robert Muir > Assignee: Robert Muir > Priority: Major > Fix For: 8.5 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently the jetty request logging is disabled (commented out). > But it can be useful, e.g. since it uses a standard logging format and there are tools to analyze it by default. It can also be used to detect some attacks not otherwise logged anywhere else, since they don't make it to the solr servlet: requests blocked at the jetty level (invalid/malformed requests, ones filtered by jetty IP filtering, etc.). > We should switch from the deprecated NCSARequestLog class to CustomRequestLog with either NCSA_FORMAT or EXTENDED_NCSA_FORMAT. > {quote} > Deprecated. > use CustomRequestLog given format string CustomRequestLog.EXTENDED_NCSA_FORMAT with a RequestLogWriter > {quote}
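For reference, the kind of jetty.xml fragment the issue is asking for might look like the following. This is a hypothetical sketch: the element syntax, ids, and log path are not taken from the actual commit and should be checked against the documentation of the Jetty version actually bundled with Solr.

```xml
<!-- Hypothetical sketch: enable the non-deprecated CustomRequestLog in jetty.xml -->
<Set name="requestLog">
  <New id="RequestLog" class="org.eclipse.jetty.server.CustomRequestLog">
    <!-- First arg: where to write the log -->
    <Arg>
      <New class="org.eclipse.jetty.server.RequestLogWriter">
        <Set name="filename">logs/yyyy_mm_dd.request.log</Set>
        <Set name="retainDays">3</Set>
      </New>
    </Arg>
    <!-- Second arg: the format string; here the extended NCSA constant -->
    <Arg>
      <Get class="org.eclipse.jetty.server.CustomRequestLog" name="EXTENDED_NCSA_FORMAT"/>
    </Arg>
  </New>
</Set>
```

Using the NCSA format means standard log-analysis tooling works on the output unchanged, which is the motivation stated in the issue.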
[jira] [Commented] (SOLR-14138) Fix commented-out RequestLog in jetty.xml to use non-deprecated class
[ https://issues.apache.org/jira/browse/SOLR-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002439#comment-17002439 ] ASF subversion and git services commented on SOLR-14138: Commit f1a674717a3c97784826b1c1b5fb2bb1cdc9d581 in lucene-solr's branch refs/heads/branch_8x from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f1a6747 ] SOLR-14138: changes.txt
[jira] [Created] (LUCENE-9109) Use Java 9+ StackWalker to implement TestSecurityManager's detection of JVM exit
Uwe Schindler created LUCENE-9109: - Summary: Use Java 9+ StackWalker to implement TestSecurityManager's detection of JVM exit Key: LUCENE-9109 URL: https://issues.apache.org/jira/browse/LUCENE-9109 Project: Lucene - Core Issue Type: Improvement Components: modules/test-framework Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: master (9.0) This is just a small improvement in Lucene/Solr master (Java 11) to detect exit of the JVM in our test framework. There are other places in Lucene that use ineffective ways to inspect the stack trace. This one optimizes the implementation of TestSecurityManager#checkExit(status) to disallow all JVM exits outside of the official test runner by using StackWalker. In addition, this needs no additional permissions, because we do not instruct StackWalker to fetch all the crazy stuff like Class instances of stack elements. The way this works is: walk through the stack trace: - skip all internal frames (those which come before the actual exit call) - skip all frames with the actual exit call - limit to one more frame (the method calling System.exit()) - check if that remaining frame is on our whitelist This can only be committed to master (9.0), as it requires Java 9.
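The four walk steps above can be sketched in pure JDK 9+ code. This is a rough illustration, not the actual patch: the whitelist entry and class names are invented, and for simplicity it treats "the exit call" as a single frame, where the real check has to skip a run of Runtime/System/SecurityManager exit frames.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ExitCheckSketch {
    // Hypothetical whitelist: the only method allowed to call System.exit().
    static final Set<String> EXIT_WHITELIST = Set.of("org.example.TestRunner.main");

    // Walk the current stack: drop the frames that come before the exit call,
    // drop the exit frame itself, and keep only the direct caller.
    static boolean exitAllowed() {
        return StackWalker.getInstance().walk(frames -> {
            List<String> caller = frames
                .map(f -> f.getClassName() + "." + f.getMethodName())
                .dropWhile(name -> !name.equals("java.lang.System.exit")) // skip internal frames
                .skip(1)   // skip the exit call itself
                .limit(1)  // only the method calling System.exit() matters
                .collect(Collectors.toList());
            return !caller.isEmpty() && EXIT_WHITELIST.contains(caller.get(0));
        });
    }

    public static void main(String[] args) {
        // No System.exit() on this stack, so no caller frame is found.
        System.out.println(exitAllowed());
    }
}
```

The "needs no additional permissions" remark corresponds to using the default `StackWalker.getInstance()`: without the `RETAIN_CLASS_REFERENCE` option, no `getStackWalkerWithClassReference` runtime permission is required.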
[GitHub] [lucene-solr] uschindler opened a new pull request #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit
uschindler opened a new pull request #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit URL: https://github.com/apache/lucene-solr/pull/1114 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (LUCENE-9109) Use Java 9+ StackWalker to implement TestSecurityManager's detection of JVM exit
[ https://issues.apache.org/jira/browse/LUCENE-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002443#comment-17002443 ] Uwe Schindler commented on LUCENE-9109: --- We should look at other places calling Thread.currentThread().getStackTrace() or throwing an exception just to get a stack trace.
[jira] [Created] (SOLR-14143) Add request-logging to securing solr page
Robert Muir created SOLR-14143: -- Summary: Add request-logging to securing solr page Key: SOLR-14143 URL: https://issues.apache.org/jira/browse/SOLR-14143 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Robert Muir This functionality was cleaned up in SOLR-14138, and for a major release I've proposed to turn it on by default in SOLR-14142. But for now, I think the "securing solr" page should instruct how to turn this on. Hopefully, if we fix the default in SOLR-14142, this paragraph can simply go away (I think it is an expert choice to not want to log such a basic thing). There is some overlap with "audit logging", but the request log is always more complete, since it logs things that never even make it to solr (as well as 4xx denied by solr itself, of course). You can see the differences by running a simple nmap script scan of your solr instance or similar.
[jira] [Commented] (SOLR-14143) Add request-logging to securing solr page
[ https://issues.apache.org/jira/browse/SOLR-14143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002448#comment-17002448 ] Robert Muir commented on SOLR-14143: Simple patch: since it isn't the default, I really want to keep it short and just make sure people are aware of it.
[jira] [Updated] (SOLR-14143) Add request-logging to securing solr page
[ https://issues.apache.org/jira/browse/SOLR-14143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-14143: --- Attachment: SOLR-14143.patch
[GitHub] [lucene-solr] uschindler commented on issue #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit
uschindler commented on issue #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit URL: https://github.com/apache/lucene-solr/pull/1114#issuecomment-568567830 You are right: > If there is a security manager, and this thread is not the current thread, then the security manager's checkPermission method is called with a RuntimePermission("getStackTrace") permission to see if it's ok to get the stack trace. As we were only looking at the current thread, it was useless to have the privileged context. Not sure why we had the permission stuff.
[jira] [Commented] (SOLR-13817) Deprecate and remove legacy SolrCache implementations
[ https://issues.apache.org/jira/browse/SOLR-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002460#comment-17002460 ] David Smiley commented on SOLR-13817: - I'm looking at our {{_default}} configSet and I see class= all over the place for the caches. Shouldn't they have been removed? > Deprecate and remove legacy SolrCache implementations > - > > Key: SOLR-13817 > URL: https://issues.apache.org/jira/browse/SOLR-13817 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0), 8.4 > > Attachments: SOLR-13817-8x.patch, SOLR-13817-master.patch > > > Now that SOLR-8241 has been committed I propose to deprecate other cache > implementations in 8x and remove them altogether from 9.0, in order to reduce > confusion and maintenance costs.
[GitHub] [lucene-solr] uschindler edited a comment on issue #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit
uschindler edited a comment on issue #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit URL: https://github.com/apache/lucene-solr/pull/1114#issuecomment-568567830 You are right: > If there is a security manager, and this thread is not the current thread, then the security manager's checkPermission method is called with a RuntimePermission("getStackTrace") permission to see if it's ok to get the stack trace. As we were only looking at the current thread, it was useless to have the privileged context. Not sure why we had the permission stuff.
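For context, a minimal sketch of the StackWalker-based approach discussed in this PR: walking only the current thread's stack needs no RuntimePermission("getStackTrace") (that permission check only applies when inspecting *another* thread via Thread.getStackTrace()). The class and method names below are illustrative, not the actual TestSecurityManager code:

```java
// Hypothetical sketch, not the real PR code: detect whether a given
// class/method (e.g. a test's call into System.exit) appears on the
// current thread's stack, using the Java 9+ StackWalker API.
public class ExitCheck {

    // True if any frame on the current stack matches className.methodName.
    // StackWalker only ever inspects the calling thread, so no
    // getStackTrace permission is involved.
    static boolean stackContains(String className, String methodName) {
        return StackWalker.getInstance().walk(frames ->
            frames.anyMatch(f -> f.getClassName().equals(className)
                              && f.getMethodName().equals(methodName)));
    }
}
```

Unlike the old `Thread.currentThread().getStackTrace()` approach, the walker streams frames lazily, so the scan can stop at the first match.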
[GitHub] [lucene-solr] andywebb1975 commented on issue #1113: SOLR-14131: adds maxQueryLength option
andywebb1975 commented on issue #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#issuecomment-568568657 Thanks Bruno and Mike - I've submitted some updates, could you take another look please?
[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360989610 ## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ## @@ -88,6 +88,45 @@ public void testOnlyMorePopularWithExtendedResults() throws Exception { "//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='fox']/arr[@name='suggestion']/lst/int[@name='freq']=2", "//lst[@name='spellcheck']/bool[@name='correctlySpelled']='true'" ); - } + } + + @Test + public void testMaxQueryLength() throws Exception { +testMaxQueryLength(true); +testMaxQueryLength(false); + } + + private void testMaxQueryLength(Boolean limitQueryLength) throws Exception { + +DirectSolrSpellChecker checker = new DirectSolrSpellChecker(); +NamedList spellchecker = new NamedList(); +spellchecker.add("classname", DirectSolrSpellChecker.class.getName()); +spellchecker.add(SolrSpellChecker.FIELD, "teststop"); +spellchecker.add(DirectSolrSpellChecker.MINQUERYLENGTH, 2); + +// demonstrate that "anothar" is not corrected when maxQueryLength is set to a small number +if (limitQueryLength) spellchecker.add(DirectSolrSpellChecker.MAXQUERYLENGTH, 4); + +SolrCore core = h.getCore(); +checker.init(spellchecker, core); + +h.getCore().withSearcher(searcher -> { + Collection tokens = queryConverter.convert("anothar"); + SpellingOptions spellOpts = new SpellingOptions(tokens, searcher.getIndexReader()); + SpellingResult result = checker.getSuggestions(spellOpts); + assertTrue("result should not be null", result != null); + Map suggestions = result.get(tokens.iterator().next()); + assertTrue("suggestions should not be null", suggestions != null); + + if (limitQueryLength) { +assertTrue("suggestions should be empty", suggestions.isEmpty()); + } else { +Map.Entry entry = suggestions.entrySet().iterator().next(); Review comment: done!
[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360989706 ## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ## @@ -79,7 +79,7 @@ public void test() throws Exception { return null; }); } - + Review comment: I think it's clearer what's going on there now!
[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option
andywebb1975 commented on a change in pull request #1113: SOLR-14131: adds maxQueryLength option URL: https://github.com/apache/lucene-solr/pull/1113#discussion_r360990276 ## File path: solr/core/src/test/org/apache/solr/spelling/DirectSolrSpellCheckerTest.java ## @@ -88,6 +88,45 @@ public void testOnlyMorePopularWithExtendedResults() throws Exception { "//lst[@name='spellcheck']/lst[@name='suggestions']/lst[@name='fox']/arr[@name='suggestion']/lst/int[@name='freq']=2", "//lst[@name='spellcheck']/bool[@name='correctlySpelled']='true'" ); - } + } + + @Test + public void testMaxQueryLength() throws Exception { +testMaxQueryLength(true); +testMaxQueryLength(false); + } + + private void testMaxQueryLength(Boolean limitQueryLength) throws Exception { + +DirectSolrSpellChecker checker = new DirectSolrSpellChecker(); +NamedList spellchecker = new NamedList(); Review comment: I've just used what you suggested here - am not too familiar with how this works.
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002424#comment-17002424 ] Joel Bernstein edited comment on SOLR-13890 at 12/23/19 8:23 PM: - The other really big aspect of this is caching. Even though the scorer-based filter can be fast if it's applied with the main query, in Solr that's not going to happen. The reason is the filter cache. Solr will apply the filter against the *entire index* and create a DocSet to cache. This will be slow compared to the postfilter if the number of search results is small relative to the size of the index. Which might be acceptable if the filter cache provided a big advantage on subsequent requests. But ... Solr's filter cache is top level so it gets dumped after a single document is loaded. So in scenarios where there is lots of indexing going on the filter cache becomes problematic. There are ways around this issue, like turning off caching using local params, or not using filter queries. But these approaches are not what users typically do with a filter. So, the postfilter's behavior (not cached in the filter cache) provides the best solution for certain situations where the filter cache is problematic. was (Author: joel.bernstein): The other really big aspect of this is caching. Even though the scorer-based filter can be fast if it's applied with the main query, in Solr that's not going to happen. The reason is the filter cache. Solr will apply the filter against the entire index and create a DocSet to cache. Our filter cache is top level so it gets dumped after a single document is loaded. So in scenarios where there is lots of indexing going on the filter cache becomes problematic. There are ways around this issue, like turning off caching using local params, or not using filter queries. But these approaches are not what users typically do with a filter. So, the postfilter's behavior (not cached in the filter cache) provides the best solution for certain situations where the filter cache is problematic. > Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch > > > There are some use-cases where it'd be nice if the "terms" qparser created a > query that could be run as a postfilter. Particularly, when users are > checking for hundreds or thousands of terms, a postfilter implementation can > be more performant than the standard processing. > With this issue, I'd like to propose a post-filter implementation for the > {{docValuesTermsFilter}} "method". Postfilter creation can use a > SortedSetDocValues object to populate a DV bitset with the "terms" being > checked for. Each document run through the post-filter can look at their > doc-values for the field in question and check them efficiently against the > constructed bitset.
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002424#comment-17002424 ] Joel Bernstein edited comment on SOLR-13890 at 12/23/19 8:24 PM: - The other really big aspect of this is caching. Even though the scorer-based filter can be fast if it's applied with the main query, in Solr that's not going to happen. The reason is the filter cache. Solr will apply the filter against the *entire index* and create a DocSet to cache. This will be slow compared to the postfilter if the number of search results is small relative to the size of the index. Which might be acceptable if the filter cache provided a big advantage on subsequent requests. But ... Solr's filter cache is top level so it gets dumped after a single document is loaded. So in scenarios where there is lots of indexing going on the filter cache becomes problematic. There are ways around this issue, like turning off caching using local params, or not using filter queries. But these approaches are not what users typically do with a filter. So, the postfilter's behavior (not cached in the filter cache) provides the best solution for certain situations where the filter cache is problematic. was (Author: joel.bernstein): The other really big aspect of this is caching. Even though the scorer-based filter can be fast if it's applied with the main query, in Solr that's not going to happen. The reason is the filter cache. Solr will apply the filter against the *entire index* and create a DocSet to cache. This will be slow compared to the postfilter if the number of search results is small relative to the size of the index. Which might be acceptable if the filter cache provided a big advantage on subsequent requests. But ... Solr's filter cache is top level so it gets dumped after a single document is loaded. So in scenarios where there is lots of indexing going on the filter cache becomes problematic. There are ways around this issue, like turning off caching using local params, or not using filter queries. But these approaches are not what users typically do with a filter. So, the postfilter's behavior (not cached in the filter cache) provides the best solution for certain situations where the filter cache is problematic.
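The "turning off caching using local params" workaround mentioned above can be sketched as follows. The collection, field, and values are placeholders; `cache=false` keeps the filter out of the filter cache, and a `cost` of 100 or more asks Solr to run a PostFilter-capable query as a postfilter:

```shell
# Hypothetical request: run a {!terms} filter uncached, as a postfilter.
# cache=false skips the filter cache; cost>=100 requests post-filtering
# for query types that implement the PostFilter interface.
curl -G 'http://localhost:8983/solr/techproducts/select' \
  --data-urlencode 'q=memory' \
  --data-urlencode 'fq={!terms f=id cache=false cost=101}SP2514N,6H500F0'
```

Because nothing is cached, this avoids the whole-index DocSet computation described above at the price of re-evaluating the filter on every request.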
[jira] [Commented] (SOLR-14141) eliminate JKS keystore from solr SSL docs
[ https://issues.apache.org/jira/browse/SOLR-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002488#comment-17002488 ] Robert Muir commented on SOLR-14141: The funniest part about this is that this step 1 is really creating a pkcs12 keystore. It is in fact not jks :) And the next step 2 that "converts" it is just converting pkcs12 <-> pkcs12. This craziness currently works because of how java's default security config is defined: {noformat} # # Default keystore type. # keystore.type=pkcs12 # # Controls compatibility mode for JKS and PKCS12 keystore types. # # When set to 'true', both JKS and PKCS12 keystore types support loading # keystore files in either JKS or PKCS12 format. When set to 'false' the # JKS keystore type supports loading only JKS keystore files and the PKCS12 # keystore type supports loading only PKCS12 keystore files. # keystore.type.compat=true {noformat} > eliminate JKS keystore from solr SSL docs > - > > Key: SOLR-14141 > URL: https://issues.apache.org/jira/browse/SOLR-14141 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Robert Muir >Priority: Major > > On the "Enabling SSL" page: > https://lucene.apache.org/solr/guide/8_3/enabling-ssl.html#enabling-ssl > The first step is currently to create a JKS keystore. The next step > immediately converts the JKS keystore into PKCS12, so that openssl can then > be used to extract key material in PEM format for use with curl. > Now that PKCS12 is java's default keystore format, why not omit step 1 > entirely? What am I missing? PKCS12 is a more commonly > understood/standardized format. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on issue #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit
uschindler commented on issue #1114: LUCENE-9109: Use stack walker to implement TestSecurityManager's detection of test JVM exit URL: https://github.com/apache/lucene-solr/pull/1114#issuecomment-568582551 I just changed the static final predicate to a static method.
[jira] [Commented] (SOLR-14141) eliminate JKS keystore from solr SSL docs
[ https://issues.apache.org/jira/browse/SOLR-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002492#comment-17002492 ] Uwe Schindler commented on SOLR-14141: -- FYI, it was always possible to run jetty with a p12 keystore. I ran Mortbay Jetty 10 years ago using a simple p12 file.
[jira] [Commented] (SOLR-14141) eliminate JKS keystore from solr SSL docs
[ https://issues.apache.org/jira/browse/SOLR-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002493#comment-17002493 ] Robert Muir commented on SOLR-14141: Yes possible, but the defaults/compat change described here looks like it happened in Java 8: https://openjdk.java.net/jeps/166 So we can easily simplify. And if someone really does have an ancient JKS keystore, it is no problem, even if we wrongly tell Java that it is in fact PKCS12. We are doing that already today in the opposite fashion (telling Java the thing is JKS format, but in reality it's PKCS12).
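A sketch of what the simplified docs flow could look like: create the keystore as PKCS12 from the start, then export PEM material for curl. The alias, filenames, passwords, and DN below are placeholders, not wording from the Ref Guide:

```shell
# Generate the key pair directly in a PKCS12 keystore -- no JKS step and
# no JKS->PKCS12 "conversion". PKCS12 has long been loadable by Java/Jetty
# and is the JDK's default keystore type since Java 9.
keytool -genkeypair -alias solr-ssl -keyalg RSA -keysize 2048 \
  -storetype PKCS12 -keystore solr-ssl.keystore.p12 \
  -storepass secret -keypass secret \
  -dname "CN=localhost, OU=Engineering, O=Example, C=US" \
  -ext "SAN=DNS:localhost,IP:127.0.0.1"

# Extract certificate and key in PEM format for use with curl.
openssl pkcs12 -in solr-ssl.keystore.p12 -out solr-ssl.pem \
  -passin pass:secret -passout pass:secret
```

Because `keystore.type.compat=true` (quoted earlier in this thread), an old JKS file handed to a PKCS12-typed load would still open, so the simplification should be safe for users with legacy keystores.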
[GitHub] [lucene-solr] megancarey opened a new pull request #1115: SOLR-13101: Fix the gson version reference
megancarey opened a new pull request #1115: SOLR-13101: Fix the gson version reference URL: https://github.com/apache/lucene-solr/pull/1115 # Description Please provide a short description of the changes you're making with this pull request. # Solution Please provide a short description of the approach taken to implement your solution. # Tests Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [ ] I have developed this patch against the `master` branch. - [ ] I have run `ant precommit` and the appropriate test suite. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[GitHub] [lucene-solr] tflobbe opened a new pull request #1116: SOLR-14135: Utils.toJavabin returns a byte[] instead of InputStream
tflobbe opened a new pull request #1116: SOLR-14135: Utils.toJavabin returns a byte[] instead of InputStream URL: https://github.com/apache/lucene-solr/pull/1116 I'm not too convinced about this PR honestly; I started thinking of doing this mostly because in the 8x branch we can't use InputStream's `readAllBytes()` method, but this may actually hurt future consumers of this method, if they don't need to read all bytes at once. I'll leave this PR for now; worst case I'll keep the tests.
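The 8x limitation mentioned here (`InputStream.readAllBytes()` only exists from Java 9 onward) can also be worked around with a small drain helper instead of changing the return type. This is an illustrative sketch under that assumption, not code from the PR:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical Java 8-compatible stand-in for InputStream.readAllBytes().
public class StreamDrain {

    static byte[] readAllBytes(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            // Copy until EOF; works for streams of unknown length.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a helper like this, Utils.toJavabin could keep returning an InputStream on both branches, and callers who truly need all bytes at once drain it themselves.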
[jira] [Updated] (LUCENE-9093) Unified highlighter with word separator never gives context to the left
[ https://issues.apache.org/jira/browse/LUCENE-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-9093: - Status: Patch Available (was: Open) > Unified highlighter with word separator never gives context to the left > --- > > Key: LUCENE-9093 > URL: https://issues.apache.org/jira/browse/LUCENE-9093 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/highlighter >Reporter: Tim Retout >Priority: Major > Attachments: LUCENE-9093.patch > > > When using the unified highlighter with hl.bs.type=WORD, I am not able to get > context to the left of the matches returned; only words to the right of each > match are shown. I see this behaviour on both Solr 6.4 and Solr 7.1. > Without context to the left of a match, the highlighted snippets are much > less useful for understanding where the match appears in a document. > As an example, using the techproducts data with Solr 7.1, given a search for > "apple", highlighting the "features" field: > http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.bs.type=WORD&hl.fragsize=30&hl.method=unified > I see this snippet: > "Apple Lossless, H.264 video" > Note that "Apple" is anchored to the left. Compare with the original > highlighter: > http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.fragsize=30 > And the match has context either side: > ", Audible, Apple Lossless, H.264 video" > (To complicate this, in general I am not sure that the unified highlighter is > respecting the hl.fragsize parameter, although [SOLR-9935] suggests support > was added. I included the hl.fragsize param in the unified URL too, but it's > making no difference unless set to 0.) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9093) Unified highlighter with word separator never gives context to the left
[ https://issues.apache.org/jira/browse/LUCENE-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002506#comment-17002506 ] David Smiley commented on LUCENE-9093: -- Sorry for the delay. Can you please post a PR as it's more conducive to the code review process? I have a question about this setting. You've declared the benefits of it for a {{hl.bs.type=WORD}} but would this also be helpful for SENTENCE too? I hope so. I think in 9.0 the {{hl.fragalign}} setting should default to {{0.5}} or maybe {{0.25}}
[jira] [Commented] (SOLR-13101) Shared storage support in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002508#comment-17002508 ] Noble Paul commented on SOLR-13101: --- I would love to see a few more details. Is it a standard Solr plugin that I can define in solrconfig.xml? Can it be configured through a remote API? If yes, which one? If not, let's have a separate discussion. What are the public touch points? * remote APIs * configurations * files created/used in ZK/filesystem We need to make every new addition to Solr easily digestible to a casual observer. > Shared storage support in SolrCloud > --- > > Key: SOLR-13101 > URL: https://issues.apache.org/jira/browse/SOLR-13101 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Yonik Seeley >Priority: Major > Time Spent: 8h > Remaining Estimate: 0h > > Solr should have first-class support for shared storage (blob/object stores > like S3, google cloud storage, etc. and shared filesystems like HDFS, NFS, > etc). > The key component will likely be a new replica type for shared storage. It > would have many of the benefits of the current "pull" replicas (not indexing > on all replicas, all shards identical with no shards getting out-of-sync, > etc), but would have additional benefits: > - Any shard could become leader (the blob store always has the index) > - Better elasticity scaling down >- durability not linked to number of replicas.. a single replica could be > common for write workloads >- could drop to 0 replicas for a shard when not needed (blob store always > has index) > - Allow for higher performance write workloads by skipping the transaction > log >- don't pay for what you don't need >- a commit will be necessary to flush to stable storage (blob store) > - A lot of the complexity and failure modes go away > An additional component is a Directory implementation that will work well with > blob stores.
We probably want one that treats local disk as a cache since > the latency to remote storage is so large. I think there are still some > "locking" issues to be solved here (ensuring that more than one writer to the > same index won't corrupt it). This should probably be pulled out into a > different JIRA issue.
[jira] [Commented] (LUCENE-9091) UnifiedHighlighter HTML escaping should only escape essentials
[ https://issues.apache.org/jira/browse/LUCENE-9091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002517#comment-17002517 ] ASF subversion and git services commented on LUCENE-9091: - Commit 1be5b689640fe4d1bf0ae3fd19c5fe93b20a77ef in lucene-solr's branch refs/heads/master from Nándor Mátravölgyi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1be5b68 ] LUCENE-9091: UnifiedHighlighter HTML escaping should only escape essentials > UnifiedHighlighter HTML escaping should only escape essentials > -- > > Key: LUCENE-9091 > URL: https://issues.apache.org/jira/browse/LUCENE-9091 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/highlighter >Reporter: Nándor Mátravölgyi >Assignee: David Smiley >Priority: Minor > Attachments: LUCENE-9091.patch > > > The unified highlighter does not use the > *org.apache.lucene.search.highlight.SimpleHTMLEncoder* through > *org.apache.solr.highlight.HtmlEncoder*. It has the HTML escaping feature > re-implemented and embedded in the > *org.apache.lucene.search.uhighlight.DefaultPassageFormatter*. > The HTML escaping done by the unified highlighter escapes characters that do > not need it. This makes the result payload 50%+ more heavy with no benefit. > Here is a highlight snippet using the original highlighter: > {noformat} > A filter that stems words using a Snowball-generated stemmer. > Available stemmers & x are listed in org.tartarus.snowball.ext. Note: > This filter is aware of the KeywordAttribute. > {noformat} > Here is the same highlight snippet using the unified highlighter: > {noformat} > A filter that stems words using a Snowball-generated stemmer. Available stemmers & x are listed in org.tartarus.snowball.ext. Note: This filter is aware of the KeywordAttribute. > {noformat} > Maybe I'm missing the point why this is done the way it is. 
> If this behaviour is desired for some use-case it should be a separate encoder, and the HTML encoder should only escape the necessary characters.
> Affects all versions of Lucene-Solr since the addition of the UnifiedHighlighter. Here are the lines where the escaping is implemented differently:
> * [Escaping by the unified highlighter|https://github.com/apache/lucene-solr/blob/2387bb9d60ae44eeeb4fbcb2f2877f46be5303a0/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/DefaultPassageFormatter.java#L132]
> * [Escaping by the other highlighters|https://github.com/apache/lucene-solr/blob/2387bb9d60ae44eeeb4fbcb2f2877f46be5303a0/lucene/highlighter/src/java/org/apache/lucene/search/highlight/SimpleHTMLEncoder.java#L69]
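To make the "escape only essentials" idea concrete, here is a minimal sketch of an HTML encoder that escapes only the five characters that can actually break HTML markup and leaves everything else intact. The class name is hypothetical; this is not the actual Lucene `SimpleHTMLEncoder` source, just an illustration of the behaviour the issue asks for:

```java
// Hypothetical sketch -- not the actual Lucene SimpleHTMLEncoder code.
// Escapes only the characters that are unsafe in HTML text or attribute
// context; every other character (including non-ASCII) passes through
// unchanged, which keeps the highlighted payload close to its original size.
public class MinimalHtmlEncoder {
  public static String encode(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '&':  sb.append("&amp;");  break;
        case '<':  sb.append("&lt;");   break;
        case '>':  sb.append("&gt;");   break;
        case '"':  sb.append("&quot;"); break;
        case '\'': sb.append("&#x27;"); break;
        default:   sb.append(c);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // prints: Available stemmers &amp; &lt;b&gt;more&lt;/b&gt;
    System.out.println(encode("Available stemmers & <b>more</b>"));
  }
}
```

An encoder like this leaves a plain-prose snippet byte-for-byte identical to its input, which is exactly why the over-escaping in `DefaultPassageFormatter` shows up as a 50%+ payload increase on entity-heavy text.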
[jira] [Commented] (LUCENE-9091) UnifiedHighlighter HTML escaping should only escape essentials
[ https://issues.apache.org/jira/browse/LUCENE-9091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002519#comment-17002519 ] ASF subversion and git services commented on LUCENE-9091: - Commit 80ad056babe577a63edf81f71d3fe525124ff43a in lucene-solr's branch refs/heads/branch_8x from Nándor Mátravölgyi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=80ad056 ] LUCENE-9091: UnifiedHighlighter HTML escaping should only escape essentials (cherry picked from commit 1be5b689640fe4d1bf0ae3fd19c5fe93b20a77ef)
[jira] [Commented] (SOLR-14095) Remove serialization and/or support serialization filtering
[ https://issues.apache.org/jira/browse/SOLR-14095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002528#comment-17002528 ] ASF subversion and git services commented on SOLR-14095: Commit 5f5ef58117578045de3798dd487b89246c15a23b in lucene-solr's branch refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5f5ef58 ] SOLR-14095: Fix Java 8 compile issue

> Remove serialization and/or support serialization filtering
> Key: SOLR-14095
> URL: https://issues.apache.org/jira/browse/SOLR-14095
> Project: Solr
> Issue Type: Task
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Robert Muir
> Priority: Major
> Attachments: SOLR-14095-json.patch, json-nl.patch
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Removing the use of serialization is greatly preferred.
> But if serialization over the wire must really happen, then we must use JDK's serialization filtering capability to prevent havoc.
> https://docs.oracle.com/javase/10/core/serialization-filtering1.htm#JSCOR-GUID-3ECB288D-E5BD-4412-892F-E9BB11D4C98A
[jira] [Commented] (SOLR-14095) Remove serialization and/or support serialization filtering
[ https://issues.apache.org/jira/browse/SOLR-14095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002527#comment-17002527 ] ASF subversion and git services commented on SOLR-14095: Commit fe04a5b6f0a5ea3c8d1d2675d12740d299d1c4b0 in lucene-solr's branch refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fe04a5b ] SOLR-14095: Let the overseer use javabin to store responses in ZooKeeper (#1095) The Overseer used java serialization to store command responses in ZooKeeper. This commit changes the code to use Javabin instead, while allowing Java serialization with a System property in case it's needed for compatibility
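For reference, the JDK serialization-filtering capability linked in the issue (`java.io.ObjectInputFilter`, JDK 9+) looks roughly like the sketch below. The allow-list pattern and class names are illustrative, not Solr's actual policy:

```java
import java.io.*;

// Minimal sketch of JDK serialization filtering (java.io.ObjectInputFilter,
// JDK 9+). The filter pattern allows only java.util.* and java.lang.* classes
// and rejects everything else ("!*"); a rejected class makes readObject()
// throw InvalidClassException. The pattern here is an illustrative allow-list,
// not Solr's real configuration.
public class FilteredDeserialize {

  public static byte[] serialize(Object o) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
      out.writeObject(o);
    }
    return bos.toByteArray();
  }

  public static Object readFiltered(byte[] bytes)
      throws IOException, ClassNotFoundException {
    ObjectInputFilter filter =
        ObjectInputFilter.Config.createFilter("java.util.*;java.lang.*;!*");
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
      in.setObjectInputFilter(filter);  // applied before each class is resolved
      return in.readObject();
    }
  }

  // A class outside the allow-list, used to demonstrate rejection.
  public static class NotAllowed implements Serializable {}

  public static void main(String[] args) throws Exception {
    System.out.println(readFiltered(serialize(
        new java.util.ArrayList<>(java.util.Arrays.asList("a", "b")))));
    try {
      readFiltered(serialize(new NotAllowed()));
    } catch (InvalidClassException expected) {
      System.out.println("rejected: " + expected.getMessage());
    }
  }
}
```

This is the "prevent havoc" half of the ticket; the committed fix goes further and avoids Java serialization in the Overseer entirely by switching to Javabin.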
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361021908 ## File path: solr/core/src/java/org/apache/solr/core/SolrResourceLoader.java ## @@ -954,4 +987,46 @@ public static void persistConfLocally(SolrResourceLoader loader, String resource } } + // TODO document these methods... Review comment: What is the motivation behind `SolrResourceLoader` returning packages? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361022358 ## File path: solr/core/src/java/org/apache/solr/schema/IndexSchemaFactory.java ## @@ -62,7 +62,7 @@ public static IndexSchema buildIndexSchema(String resourceName, SolrConfig confi PluginInfo info = config.getPluginInfo(IndexSchemaFactory.class.getName()); IndexSchemaFactory factory; if (null != info) { - factory = config.getResourceLoader().newInstance(info.className, IndexSchemaFactory.class); + factory = config.getResourceLoader().newInstance(info, IndexSchemaFactory.class); Review comment: Do we even support the `packageName:ClassName` scheme in `schema.xml` at all? How does it play with this? Needs more discussion.
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361020495 ## File path: solr/core/src/java/org/apache/solr/core/SolrCore.java ## @@ -519,7 +518,7 @@ private IndexDeletionPolicyWrapper initDeletionPolicy(IndexDeletionPolicyWrapper final PluginInfo info = solrConfig.getPluginInfo(IndexDeletionPolicy.class.getName()); final IndexDeletionPolicy delPolicy; if (info != null) { - delPolicy = createInstance(info.className, IndexDeletionPolicy.class, "Deletion Policy for SOLR", this, getResourceLoader()); + delPolicy = newInstance(info, IndexDeletionPolicy.class, this, getResourceLoader()); Review comment: Looks wrong. Shouldn't we get the correct Package classloader here?
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361018813 ## File path: solr/core/src/java/org/apache/solr/core/DirectoryFactory.java ## @@ -420,7 +420,7 @@ static DirectoryFactory loadDirectoryFactory(SolrConfig config, CoreContainer cc final DirectoryFactory dirFactory; if (info != null) { log.debug(info.className); - dirFactory = config.getResourceLoader().newInstance(info.className, DirectoryFactory.class); + dirFactory = config.getResourceLoader().newInstance(info, DirectoryFactory.class); Review comment: What's the point of this call? We should never use `config.getResourceLoader()`
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361021700 ## File path: solr/core/src/java/org/apache/solr/core/SolrResourceLoader.java ## @@ -529,6 +536,18 @@ public String resourceLocation(String resource) { Class clazz = null; try { + // If there is a package name prefix ... + Pair pkgClassPair = PluginInfo.parseClassName(cname); + PackageLoader.Package pkg = getPackage(pkgClassPair.first()); + if (pkg == null) { +// essentially, remove the package prefix and continue as normal. Maybe it'll be found. +cname = pkgClassPair.second(); + } else { +// TODO what version? Review comment: Trying to load the latest always? Why? Need to revisit.
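The `packageName:className` resolution being reviewed above can be sketched as follows. This is a hypothetical illustration (the real code returns a `Pair` from `PluginInfo.parseClassName`; the names `PkgResolver` and `parse` are invented here), showing the split-then-fall-back behavior discussed in the diff:

```java
// Hypothetical sketch of "packageName:className" resolution: split on the
// first ':' to get a package prefix, and if the package is unknown, strip the
// prefix and fall back to treating the whole string as a plain class name.
// Class and method names are illustrative, not Solr's actual API.
public class PkgResolver {

  /** Returns {packageName, className}; packageName is null when there is no prefix. */
  public static String[] parse(String cname) {
    int idx = cname.indexOf(':');
    if (idx <= 0) {
      return new String[] {null, cname};  // no usable package prefix
    }
    return new String[] {cname.substring(0, idx), cname.substring(idx + 1)};
  }

  public static void main(String[] args) {
    String[] p = parse("mypkg:com.example.MyPlugin");
    // prints: mypkg / com.example.MyPlugin
    System.out.println(p[0] + " / " + p[1]);
  }
}
```

The reviewer's objection is about the branch after this split: when the package *is* found, the patch always loads the package's latest version, with no policy for which version a core should pin.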
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361019528 ## File path: solr/core/src/java/org/apache/solr/core/PluginBag.java ## @@ -140,11 +139,10 @@ public static void initInstance(Object inst, PluginInfo info) { log.debug("{} : '{}' created with startup=lazy ", meta.getCleanTag(), info.name); return new LazyPluginHolder(meta, info, core, core.getResourceLoader(), false); } else { - if (info.pkgName != null) { -PackagePluginHolder holder = new PackagePluginHolder<>(info, core, meta); -return holder; + if (core.getResourceLoader().getPackage(info.pkgName) != null) { Review comment: Shouldn't we fail fast instead of continuing as if there is no problem?
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361021135 ## File path: solr/core/src/java/org/apache/solr/core/SolrCore.java ## @@ -810,28 +809,30 @@ void initIndex(boolean passOnPreviousState, boolean reload) throws IOException { * Creates an instance by trying a constructor that accepts a SolrCore before * trying the default (no arg) constructor. * - * @param className the instance class to create + * @param pluginInfo the instance class to create * @param cast the class or interface that the instance should extend or implement - * @param msg a message helping compose the exception error if any occurs. * @param core The SolrCore instance for which this object needs to be loaded * @return the desired instance * @throws SolrException if the object could not be instantiated */ - public static T createInstance(String className, Class cast, String msg, SolrCore core, ResourceLoader resourceLoader) { -Class clazz = null; -if (msg == null) msg = "SolrCore Object"; + public static T newInstance(PluginInfo pluginInfo, Class cast, SolrCore core, SolrResourceLoader resourceLoader) { +String msg = pluginInfo.type; try { - clazz = resourceLoader.findClass(className, cast); - //most of the classes do not have constructors which takes SolrCore argument. It is recommended to obtain SolrCore by implementing SolrCoreAware. - // So invariably always it will cause a NoSuchMethodException. 
So iterate through the list of available constructors - Constructor[] cons = clazz.getConstructors(); - for (Constructor con : cons) { -Class[] types = con.getParameterTypes(); -if (types.length == 1 && types[0] == SolrCore.class) { - return cast.cast(con.newInstance(core)); + //TODO separate out "core" scenario to another method + if (pluginInfo.pkgName == null && core != null) { +Class clazz = resourceLoader.findClass(pluginInfo.className, cast); +//most of the classes do not have constructors which takes SolrCore argument. It is recommended to obtain SolrCore by implementing SolrCoreAware. Review comment: This has nothing to do with the package loader. Can be a separate ticket.
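The constructor-scanning pattern quoted in the diff above can be sketched generically like this. It is a hedged illustration, not Solr's `SolrCore.newInstance`: prefer a public one-argument constructor that accepts the supplied "core"-like object, else fall back to the no-arg constructor, which avoids the guaranteed `NoSuchMethodException` from asking directly for a constructor most plugin classes do not declare:

```java
import java.lang.reflect.Constructor;

// Generic sketch of the reflection pattern in the diff: scan the public
// constructors for one whose single parameter can accept ctorArg; otherwise
// use the no-arg constructor. Names are illustrative, not Solr's actual API.
public class PluginFactory {

  public static <T> T newInstance(Class<T> clazz, Object ctorArg) throws Exception {
    for (Constructor<?> con : clazz.getConstructors()) {
      Class<?>[] types = con.getParameterTypes();
      if (types.length == 1 && types[0].isInstance(ctorArg)) {
        return clazz.cast(con.newInstance(ctorArg));  // one-arg constructor wins
      }
    }
    return clazz.cast(clazz.getConstructor().newInstance());  // no-arg fallback
  }

  // Example plugin with both constructor shapes.
  public static class Greeter {
    final String who;
    public Greeter(String who) { this.who = who; }
    public Greeter() { this("nobody"); }
  }

  public static void main(String[] args) throws Exception {
    // prints: core
    System.out.println(newInstance(Greeter.class, "core").who);
  }
}
```

Note the reviewer's larger point stands regardless of this mechanism: the instance is constructed once at core load, and nothing here re-creates it when the backing package is updated.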
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361021282 ## File path: solr/core/src/java/org/apache/solr/core/SolrCore.java ## @@ -890,7 +897,7 @@ private UpdateHandler createReloadedUpdateHandler(String className, String msg, } private UpdateHandler createUpdateHandler(String className) { -return createInstance(className, UpdateHandler.class, "Update Handler", this, getResourceLoader()); +return newInstance(new PluginInfo("updateHandler", className), UpdateHandler.class, this, getResourceLoader()); Review comment: Again, no thought given to how this updates if the package is updated.
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361017640 ## File path: solr/contrib/velocity/src/java/org/apache/solr/response/VelocityResponseWriter.java ## @@ -275,7 +276,7 @@ private VelocityContext createContext(SolrQueryRequest request, SolrQueryRespons for (Map.Entry entry : customTools.entrySet()) { String name = entry.getKey(); // TODO: at least log a warning when one of the *fixed* tools classes is same name with a custom one, currently silently ignored - Object customTool = SolrCore.createInstance(entry.getValue(), Object.class, "VrW custom tool: " + name, request.getCore(), request.getCore().getResourceLoader()); + Object customTool = SolrCore.newInstance(new PluginInfo(name, entry.getValue()), Object.class, request.getCore(), request.getCore().getResourceLoader()); Review comment: What is the purpose of this? Apparently this is not even using the right classloader.
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo
noblepaul commented on a change in pull request #1109: More pervasive use of PackageLoader / PluginInfo URL: https://github.com/apache/lucene-solr/pull/1109#discussion_r361021009 ## File path: solr/core/src/java/org/apache/solr/core/SolrCore.java ## @@ -810,28 +809,30 @@ void initIndex(boolean passOnPreviousState, boolean reload) throws IOException { * Creates an instance by trying a constructor that accepts a SolrCore before * trying the default (no arg) constructor. * - * @param className the instance class to create + * @param pluginInfo the instance class to create * @param cast the class or interface that the instance should extend or implement - * @param msg a message helping compose the exception error if any occurs. * @param core The SolrCore instance for which this object needs to be loaded * @return the desired instance * @throws SolrException if the object could not be instantiated */ - public static T createInstance(String className, Class cast, String msg, SolrCore core, ResourceLoader resourceLoader) { -Class clazz = null; -if (msg == null) msg = "SolrCore Object"; + public static T newInstance(PluginInfo pluginInfo, Class cast, SolrCore core, SolrResourceLoader resourceLoader) { +String msg = pluginInfo.type; try { - clazz = resourceLoader.findClass(className, cast); - //most of the classes do not have constructors which takes SolrCore argument. It is recommended to obtain SolrCore by implementing SolrCoreAware. - // So invariably always it will cause a NoSuchMethodException. 
So iterate through the list of available constructors - Constructor[] cons = clazz.getConstructors(); - for (Constructor con : cons) { -Class[] types = con.getParameterTypes(); -if (types.length == 1 && types[0] == SolrCore.class) { - return cast.cast(con.newInstance(core)); + //TODO separate out "core" scenario to another method + if (pluginInfo.pkgName == null && core != null) { +Class clazz = resourceLoader.findClass(pluginInfo.className, cast); Review comment: The repeated pattern I see is: the patch only looks at how to load the class when the core is loaded, and pays no attention to how to reload it if the package is updated.