[jira] [Resolved] (SOLR-14512) Require java 8 upgrade

2020-05-26 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-14512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-14512.

Resolution: Invalid

You asked the same question on two mailing lists, and got an answer on May 22nd:

{quote}
Hi Akhila,

Here is the related documentation: 
https://lucene.apache.org/solr/5_3_1/SYSTEM_REQUIREMENTS.html which says:

"Apache Solr runs on Java 7 or greater. Java 8 is verified to be compatible and 
may bring some performance improvements. When using Oracle Java 7 or OpenJDK 7, 
be sure to not use the GA build 147 or update versions u40, u45 and u51! We 
recommend using u55 or later."

Kind Regards,
Furkan KAMACI
{quote}

In the future, please ask questions on the *solr-user* mailing list only (you 
have to be subscribed to read the answers). This JIRA is not a support channel. 
Closing.

> Require java 8 upgrade
> --
>
> Key: SOLR-14512
> URL: https://issues.apache.org/jira/browse/SOLR-14512
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
> Environment: Production
>Reporter: Akhila John
>Priority: Critical
>
> Hi Team, 
> We use Solr 5.3.1 for Sitecore 8.2.
> We need to upgrade our Java version to 'Java 8 Update 251' and remove / 
> upgrade Wireshark to 3.2.3 on our application servers.
> Could you please advise whether this would have any impact on Solr. Does Solr 
> 5.3.1 support Java 8?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14513) can not compile with jdk1.8.0_20

2020-05-26 Thread aiqunfang (Jira)
aiqunfang created SOLR-14513:


 Summary: can not compile with jdk1.8.0_20
 Key: SOLR-14513
 URL: https://issues.apache.org/jira/browse/SOLR-14513
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Build
Affects Versions: 8.5.1
Reporter: aiqunfang


While compiling with jdk1.8.0_20 using the following command:

ant compile

the build aborts with a compile error in this file:

solr-8.5.1\lucene\queries\src\java\org\apache\lucene\queries\intervals\ConjunctionDISI.java

at line 68:

64  private ConjunctionDISI(List iterators) {
65    assert iterators.size() >= 2;
66    // Sort the array the first time to allow the least frequent DocsEnum to
67    // lead the matching.
68    CollectionUtil.timSort(iterators, Comparator.comparingLong(DocIdSetIterator::cost));
69    lead1 = iterators.get(0);
70    lead2 = iterators.get(1);
71    others = iterators.subList(2, iterators.size()).toArray(new DocIdSetIterator[0]);
72  }
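For readers hitting the same failure, the pattern on line 68 can be reduced to a stdlib-only sketch. Everything here is hypothetical: CollectionUtil.timSort is replaced by List.sort, and DocIdSetIterator by a made-up Costed class. Early 8u javac builds had type-inference bugs that generic method references in this style could trip, which is consistent with the resolution advice below to use the newest Java 8 JDK.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Stdlib-only sketch of the pattern on line 68. CollectionUtil.timSort is
// replaced by List.sort, and DocIdSetIterator by a hypothetical Costed class.
public class CostSortSketch {
    static class Costed {
        final long cost;
        Costed(long cost) { this.cost = cost; }
        long cost() { return cost; }
    }

    public static void main(String[] args) {
        List<Costed> iterators = new ArrayList<>();
        iterators.add(new Costed(30));
        iterators.add(new Costed(10));
        iterators.add(new Costed(20));
        // Sort ascending by cost so the least frequent iterator leads.
        iterators.sort(Comparator.comparingLong(Costed::cost));
        System.out.println(iterators.get(0).cost());
    }
}
```

If a compiler rejects this style, an explicit lambda such as `(a, b) -> Long.compare(a.cost(), b.cost())` usually sidesteps the inference path; a newer Java 8 JDK is the real fix.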
 






[jira] [Updated] (SOLR-14513) can not compile with jdk1.8.0_20

2020-05-26 Thread aiqunfang (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

aiqunfang updated SOLR-14513:
-
Description: 
While compiling with jdk1.8.0_20 using the following command:

ant compile

the build aborts with a compile error in this file:

solr-8.5.1\lucene\queries\src\java\org\apache\lucene\queries\intervals\ConjunctionDISI.java

at line 68:

 64  private ConjunctionDISI(List iterators) {
 65    assert iterators.size() >= 2;
 66    // Sort the array the first time to allow the least frequent DocsEnum to
 67    // lead the matching.
 68    CollectionUtil.timSort(iterators, Comparator.comparingLong(DocIdSetIterator::cost));
 69    lead1 = iterators.get(0);
 70    lead2 = iterators.get(1);
 71    others = iterators.subList(2, iterators.size()).toArray(new DocIdSetIterator[0]);
 72  }

Then I set up JDK 13, and the compile succeeds:

ant compile

  was:
While compiling with jdk1.8.0_20 using the following command:

ant compile

the build aborts with a compile error in this file:

solr-8.5.1\lucene\queries\src\java\org\apache\lucene\queries\intervals\ConjunctionDISI.java

at line 68:

64  private ConjunctionDISI(List iterators) {
65    assert iterators.size() >= 2;
66    // Sort the array the first time to allow the least frequent DocsEnum to
67    // lead the matching.
68    CollectionUtil.timSort(iterators, Comparator.comparingLong(DocIdSetIterator::cost));
69    lead1 = iterators.get(0);
70    lead2 = iterators.get(1);
71    others = iterators.subList(2, iterators.size()).toArray(new DocIdSetIterator[0]);
72  }
 


> can not compile with jdk1.8.0_20
> 
>
> Key: SOLR-14513
> URL: https://issues.apache.org/jira/browse/SOLR-14513
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build
>Affects Versions: 8.5.1
>Reporter: aiqunfang
>Priority: Major
>
> While compiling with jdk1.8.0_20 using the following command:
> ant compile
>  
> the build aborts with a compile error in this file:
> solr-8.5.1\lucene\queries\src\java\org\apache\lucene\queries\intervals\ConjunctionDISI.java
> at line 68:
>  
>  64  private ConjunctionDISI(List iterators) {
>  65    assert iterators.size() >= 2;
>  66    // Sort the array the first time to allow the least frequent DocsEnum to
>  67    // lead the matching.
>  68    CollectionUtil.timSort(iterators, Comparator.comparingLong(DocIdSetIterator::cost));
>  69    lead1 = iterators.get(0);
>  70    lead2 = iterators.get(1);
>  71    others = iterators.subList(2, iterators.size()).toArray(new DocIdSetIterator[0]);
>  72  }
>  
> Then I set up JDK 13, and the compile succeeds:
> ant compile









[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues

2020-05-26 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116518#comment-17116518
 ] 

Adrien Grand commented on LUCENE-9378:
--

I profiled some of these sorting tasks to understand where time is spent, and 
while there is non-negligible time spent reading lengths, the bulk of the CPU 
time is spent decompressing bytes given how highly compressible titles are in 
wikimedium with lots of exact duplicates. Furthermore, in your case, decoding 
the lengths is likely even cheaper given that all documents have the same 
length.

We have discussed building dictionaries in the past, in order to avoid 
decompressing all the values in a block when we only need one. This was 
initially for stored fields, but I believe it could help here too. It's not 
really low-hanging fruit, though. Would it be an acceptable workaround for you 
if, in the meantime, we just disabled compression for short binary values? 
E.g., if the average length of values in a block is small enough, we'd write 
the values without compression. This would avoid introducing a flag.

> Configurable compression for BinaryDocValues
> 
>
> Key: LUCENE-9378
> URL: https://issues.apache.org/jira/browse/LUCENE-9378
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Viral Gandhi
>Priority: Minor
>
> Lucene 8.5.1 includes a change to always [compress 
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This 
> caused a ~30% reduction in our red-line QPS (throughput). 
> We think users should be given some way to opt in to this compression 
> feature instead of it always being enabled, which can have a substantial 
> query-time cost, as we saw during our upgrade. [~mikemccand] suggested one 
> possible approach: introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED 
> and UNCOMPRESSED) and allowing users to create a custom Codec, subclassing the 
> default Codec, to pick the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, 
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's a related issue for adding a benchmark covering BINARY doc values 
> query-time performance: [https://github.com/mikemccand/luceneutil/issues/61]






[jira] [Resolved] (SOLR-14513) can not compile with jdk1.8.0_20

2020-05-26 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-14513.

Resolution: Cannot Reproduce

Hi,

There is no issue with compiling Lucene 8.5 with Java 8.

Please use this Jira only for confirmed bugs in the software. I'm closing this 
issue now and asking you to use the solr-user mailing list (see 
[https://lucene.apache.org/solr/community.html#mailing-lists-irc]) for 
further questions about this. When you do, please include more details, such as 
your environment, your OS, how you obtained the Solr source code, and the 
exact error message you see. Also, you should use the newest Java 8 JDK.

Please do not reply to this Jira but use the mailing lists.

> can not compile with jdk1.8.0_20
> 
>
> Key: SOLR-14513
> URL: https://issues.apache.org/jira/browse/SOLR-14513
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build
>Affects Versions: 8.5.1
>Reporter: aiqunfang
>Priority: Major
>
> While compiling with jdk1.8.0_20 using the following command:
> ant compile
>  
> the build aborts with a compile error in this file:
> solr-8.5.1\lucene\queries\src\java\org\apache\lucene\queries\intervals\ConjunctionDISI.java
> at line 68:
>  
>  64  private ConjunctionDISI(List iterators) {
>  65    assert iterators.size() >= 2;
>  66    // Sort the array the first time to allow the least frequent DocsEnum to
>  67    // lead the matching.
>  68    CollectionUtil.timSort(iterators, Comparator.comparingLong(DocIdSetIterator::cost));
>  69    lead1 = iterators.get(0);
>  70    lead2 = iterators.get(1);
>  71    others = iterators.subList(2, iterators.size()).toArray(new DocIdSetIterator[0]);
>  72  }
>  
> Then I set up JDK 13, and the compile succeeds:
> ant compile






[jira] [Created] (LUCENE-9381) Extract necessary SortField methods into a new interface

2020-05-26 Thread Alan Woodward (Jira)
Alan Woodward created LUCENE-9381:
-

 Summary: Extract necessary SortField methods into a new interface
 Key: LUCENE-9381
 URL: https://issues.apache.org/jira/browse/LUCENE-9381
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward


Step 2 of LUCENE-9326.  SortField has a bunch of cruft on it that makes 
creating new sorts overly complicated.  This ticket will extract a new 
SortOrder interface from SortField that only contains the methods necessary for 
implementing a sort.






[jira] [Commented] (LUCENE-9381) Extract necessary SortField methods into a new interface

2020-05-26 Thread Alan Woodward (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116592#comment-17116592
 ] 

Alan Woodward commented on LUCENE-9381:
---

I opened a PR: https://github.com/apache/lucene-solr/pull/1537

Some comments:
* we still have a `getReverse()` method on SortOrder, but I'd like to remove 
that in a follow-up - sort orders should be able to cope with 
ascending/descending sorts within their comparator functions.
* this makes Sort itself shallowly immutable, and as we move various sorts to 
implement SortOrder rather than extending SortField we can make them immutable 
as well.  It also removes a whole bunch of outdated javadoc on Sort, including 
my favourite doc line "you can re-use a Sort by changing its sort fields/this 
object is thread-safe"
* SortOrder has a name() method which is separate from its toString() 
implementation.  This is for distributed sorts to use as a key - toString() 
contains information about the source field, but also asc/desc and missing 
values, which you probably don't want when displaying sort values in results.
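As a rough illustration of the points above (hypothetical names only; the real interface lives in the linked PR), the extraction could look like:

```java
// Hypothetical sketch of the SortOrder extraction discussed above. The names
// follow the comment, not the actual PR: name() is a stable key for
// distributed sorts, while toString() carries extra detail like asc/desc.
interface SortOrder {
    String name();        // stable key, separate from toString()
    boolean getReverse(); // still present, slated for removal in a follow-up
}

public class SortOrderSketch {
    // An immutable sort order over a hypothetical long-valued field.
    static final class LongFieldOrder implements SortOrder {
        private final String field;
        private final boolean reverse;
        LongFieldOrder(String field, boolean reverse) {
            this.field = field;
            this.reverse = reverse;
        }
        public String name() { return field; }
        public boolean getReverse() { return reverse; }
        @Override public String toString() {
            return field + (reverse ? " desc" : " asc"); // richer than name()
        }
    }

    public static void main(String[] args) {
        SortOrder order = new LongFieldOrder("price", true);
        System.out.println(order.name() + "|" + order);
    }
}
```

Because every field is final, instances like this are safe to share across threads, which is the immutability goal described above.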

> Extract necessary SortField methods into a new interface
> 
>
> Key: LUCENE-9381
> URL: https://issues.apache.org/jira/browse/LUCENE-9381
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
>
> Step 2 of LUCENE-9326.  SortField has a bunch of cruft on it that makes 
> creating new sorts overly complicated.  This ticket will extract a new 
> SortOrder interface from SortField that only contains the methods necessary 
> for implementing a sort.






[jira] [Created] (LUCENE-9382) Lucene's gradle version can't cope with Java 14

2020-05-26 Thread Alan Woodward (Jira)
Alan Woodward created LUCENE-9382:
-

 Summary: Lucene's gradle version can't cope with Java 14
 Key: LUCENE-9382
 URL: https://issues.apache.org/jira/browse/LUCENE-9382
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward


If you have JDK 14 installed as your default java, then attempting to use 
gradle within the lucene-solr project can result in errors, particularly if you 
have other projects that use more recent gradle versions on the same machine.

```
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
    at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
    at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
```






[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-26 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116624#comment-17116624
 ] 

ASF subversion and git services commented on SOLR-14498:


Commit 22044fcabb84b17909a90abe9b411ab838f01d26 in lucene-solr's branch 
refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=22044fc ]

SOLR-14498: Upgrade to Caffeine 2.8.4, which fixes the cache poisoning issue.


> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}}, and an 
> "eviction mechanism", i.e. {{cache}}, implemented by a Caffeine cache.
> The relation between them is that the "storage" enforces a strict limit on the 
> number of entries ({{numberOfBlocksPerBank * numberOfBanks}}), whereas the 
> "eviction mechanism" takes care of freeing entries from the storage, thanks to 
> the Caffeine cache's {{maximumSize}} being set to 
> {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on the Caffeine cache to eventually free at least 1 entry 
> from the storage. If that doesn't happen, the {{BlockCache}} starts to fail 
> all new stores.
> As it turns out, the Caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}}, or {{getIfPresent}} which *finds an 
> entry*, is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio): the cache is poisoned by non-reusable entries, and new, 
> reusable entries are not stored and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> The change in Caffeine that triggers its internal cleanup mechanism regardless 
> of whether getIfPresent gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in Caffeine 2.8.4.
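The capacity relationship described above can be stated as a tiny stdlib-only sketch; the constants below are illustrative, not Solr's defaults:

```java
// Illustrative sketch of the BlockCache size invariant described above: the
// eviction cache's maximumSize sits one below the storage capacity, so a
// timely eviction should always leave at least one free storage slot. The
// bug was that eviction could stall until a put or a hitting getIfPresent.
public class BlockCacheSizes {
    public static void main(String[] args) {
        int numberOfBlocksPerBank = 16384; // illustrative values only
        int numberOfBanks = 4;
        long storageCapacity = (long) numberOfBlocksPerBank * numberOfBanks;
        long cacheMaximumSize = storageCapacity - 1;
        System.out.println(storageCapacity + " " + cacheMaximumSize);
    }
}
```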






[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-26 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116668#comment-17116668
 ] 

ASF subversion and git services commented on SOLR-14498:


Commit d8542354072c05a060aa542f6107bcf6eee7ba4a in lucene-solr's branch 
refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d854235 ]

SOLR-14498: Upgrade to Caffeine 2.8.4, which fixes the cache poisoning issue.


> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}}, and an 
> "eviction mechanism", i.e. {{cache}}, implemented by a Caffeine cache.
> The relation between them is that the "storage" enforces a strict limit on the 
> number of entries ({{numberOfBlocksPerBank * numberOfBanks}}), whereas the 
> "eviction mechanism" takes care of freeing entries from the storage, thanks to 
> the Caffeine cache's {{maximumSize}} being set to 
> {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on the Caffeine cache to eventually free at least 1 entry 
> from the storage. If that doesn't happen, the {{BlockCache}} starts to fail 
> all new stores.
> As it turns out, the Caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}}, or {{getIfPresent}} which *finds an 
> entry*, is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio): the cache is poisoned by non-reusable entries, and new, 
> reusable entries are not stored and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> The change in Caffeine that triggers its internal cleanup mechanism regardless 
> of whether getIfPresent gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in Caffeine 2.8.4.






[jira] [Resolved] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-26 Thread Andrzej Bialecki (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki resolved SOLR-14498.
-
Fix Version/s: 8.6
   Resolution: Fixed

Thanks [~jakubzytka] for reporting this and for your analysis.

> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.6
>
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}}, and an 
> "eviction mechanism", i.e. {{cache}}, implemented by a Caffeine cache.
> The relation between them is that the "storage" enforces a strict limit on the 
> number of entries ({{numberOfBlocksPerBank * numberOfBanks}}), whereas the 
> "eviction mechanism" takes care of freeing entries from the storage, thanks to 
> the Caffeine cache's {{maximumSize}} being set to 
> {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on the Caffeine cache to eventually free at least 1 entry 
> from the storage. If that doesn't happen, the {{BlockCache}} starts to fail 
> all new stores.
> As it turns out, the Caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}}, or {{getIfPresent}} which *finds an 
> entry*, is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio): the cache is poisoned by non-reusable entries, and new, 
> reusable entries are not stored and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> The change in Caffeine that triggers its internal cleanup mechanism regardless 
> of whether getIfPresent gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in Caffeine 2.8.4.






[jira] [Commented] (LUCENE-9325) Sort and SortField are not immutable

2020-05-26 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116679#comment-17116679
 ] 

Michael McCandless commented on LUCENE-9325:


+1, clearly trappy today, and we (Amazon product search) fell into the trap and 
were very confused!!

> Sort and SortField are not immutable
> 
>
> Key: LUCENE-9325
> URL: https://issues.apache.org/jira/browse/LUCENE-9325
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
>
> The `Sort` and `SortField` classes are currently mutable, which makes them 
> dangerous to use in multiple threads.  In particular, you can set an index 
> sort on an IndexWriterConfig and then change its internal sort fields while 
> the index is being written to.
> We should make all member fields on these classes final, and in addition we 
> should make `Sort` final itself.






[jira] [Commented] (LUCENE-9382) Lucene's gradle version can't cope with Java 14

2020-05-26 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116683#comment-17116683
 ] 

Erick Erickson commented on LUCENE-9382:


Looks like https://github.com/gradle/gradle/issues/10248?

Should be fixed if we upgrade Gradle to 6.3 or later: 
https://docs.gradle.org/6.3/release-notes.html#support-for-java-14
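One way to apply that fix is to pin the wrapper to Gradle 6.3 or later. Assuming the standard wrapper layout, the distribution URL in `gradle/wrapper/gradle-wrapper.properties` would look like:

```
# gradle/wrapper/gradle-wrapper.properties (sketch: pin Gradle 6.3+ for Java 14)
distributionUrl=https\://services.gradle.org/distributions/gradle-6.3-all.zip
```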

> Lucene's gradle version can't cope with Java 14
> ---
>
> Key: LUCENE-9382
> URL: https://issues.apache.org/jira/browse/LUCENE-9382
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Priority: Major
>
> If you have JDK 14 installed as your default java, then attempting to use 
> gradle within the lucene-solr project can result in errors, particularly if 
> you have other projects that use more recent gradle versions on the same 
> machine.
> ```
> java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
>     at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
>     at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
> ```






[jira] [Commented] (SOLR-14280) SolrConfig logging not helpful

2020-05-26 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116700#comment-17116700
 ] 

ASF subversion and git services commented on SOLR-14280:


Commit 46ca7686875243a09aecfbfb49b05f457ad68751 in lucene-solr's branch 
refs/heads/master from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=46ca768 ]

SOLR-14280: SolrConfig error handling improvements


> SolrConfig logging not helpful
> --
>
> Key: SOLR-14280
> URL: https://issues.apache.org/jira/browse/SOLR-14280
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Priority: Minor
> Attachments: SOLR-14280-01.patch, SOLR-14280-02.patch, getmessages.txt
>
>
> SolrConfig prints out a warning message if it's not able to add files to the 
> classpath, but this message is not too helpful:
> {noformat}
> o.a.s.c.SolrConfig Couldn't add files from 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/dist filtered 
> by solr-langid-\d.*\.jar to classpath: 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/
> dist {noformat}
> The reason should be at the end of the log message, but it just repeats the 
> problematic file name.
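The commit above reworks that error handling. A stdlib-only sketch of the shape of such a fix (hypothetical method names, not Solr's actual code) is to end the warning with the failure's reason instead of repeating the path:

```java
import java.io.IOException;

// Hypothetical sketch: build the warning from the underlying exception's
// message so the log line ends with the reason, not a repeat of the path.
public class WarnWithCause {
    static String buildWarning(String dir, String filter, Exception cause) {
        return "Couldn't add files from " + dir + " filtered by " + filter
                + " to classpath: " + cause.getMessage();
    }

    public static void main(String[] args) {
        Exception cause = new IOException("Permission denied");
        System.out.println(
                buildWarning("/opt/solr/dist", "solr-langid-\\d.*\\.jar", cause));
    }
}
```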






[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues

2020-05-26 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116713#comment-17116713
 ] 

Michael Sokolov commented on LUCENE-9378:
-

{quote}... disabled compression for short binary values ...
{quote}
That seems like a good compromise. It should preserve the main benefit of 
compression. Our main use case is 10-character values, although we have a 
related field that could be larger - a threshold somewhere in 32-64 bytes makes 
sense to me. That would capture e.g. UUIDs even with their hyphens.
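A minimal sketch of the threshold idea being discussed (the method name and the 32-byte cutoff are illustrative, not the final implementation):

```java
// Illustrative sketch: compress a block only when the average value length
// clears a small threshold, so short values (IDs, UUIDs) stay uncompressed
// while long, highly compressible values (e.g. titles) are still compressed.
public class CompressionHeuristic {
    static boolean shouldCompress(long totalBytes, int numValues, int threshold) {
        return numValues > 0 && totalBytes / numValues >= threshold;
    }

    public static void main(String[] args) {
        int threshold = 32; // in the 32-64 byte range suggested above
        System.out.println(shouldCompress(10L * 4096, 4096, threshold));  // 10-byte IDs
        System.out.println(shouldCompress(500L * 4096, 4096, threshold)); // long values
    }
}
```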

> Configurable compression for BinaryDocValues
> 
>
> Key: LUCENE-9378
> URL: https://issues.apache.org/jira/browse/LUCENE-9378
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Viral Gandhi
>Priority: Minor
>
> Lucene 8.5.1 includes a change to always [compress 
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This 
> caused a ~30% reduction in our red-line QPS (throughput). 
> We think users should be given some way to opt in to this compression 
> feature instead of it always being enabled, which can have a substantial 
> query-time cost, as we saw during our upgrade. [~mikemccand] suggested one 
> possible approach: introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED 
> and UNCOMPRESSED) and allowing users to create a custom Codec, subclassing the 
> default Codec, to pick the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, 
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's a related issue for adding a benchmark covering BINARY doc values 
> query-time performance: [https://github.com/mikemccand/luceneutil/issues/61]






[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues

2020-05-26 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116719#comment-17116719
 ] 

Michael Sokolov commented on LUCENE-9378:
-

Another use case I'd like to plan for is using {{BinaryDocValues}} to store 
numeric vectors and retrieve them for time-sensitive computations (think 64-256 
dimension floating point numbers, or maybe bytes, so something in the range of 
64-1024 bytes per document). Probably these would be somewhat compressible if a 
number of the dimensions take on zero values? We can revisit later if/when this 
use case becomes problematic, but I think it could require some manual control 
to preserve best query performance.
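The vector use case above can be sketched with plain JDK code; the helper names are assumptions, and Lucene itself would only see the resulting opaque byte[] as the BinaryDocValues payload:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Sketch of packing a float vector into the byte[] payload of a
// BinaryDocValues field at index time and unpacking it at query time.
public class VectorPayloadSketch {
    static byte[] encode(float[] vector) {
        ByteBuffer buf = ByteBuffer.allocate(vector.length * Float.BYTES);
        for (float v : vector) {
            buf.putFloat(v);
        }
        return buf.array();
    }

    static float[] decode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        float[] vector = new float[payload.length / Float.BYTES];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = buf.getFloat();
        }
        return vector;
    }

    public static void main(String[] args) {
        float[] v = new float[64]; // 64 dimensions -> 256 bytes per document
        v[0] = 1.5f;
        v[63] = -0.25f;
        byte[] payload = encode(v);
        System.out.println(payload.length);                    // 256
        System.out.println(Arrays.equals(v, decode(payload))); // true
    }
}
```

Payloads like this (mostly small floats, many zeros) are exactly the "somewhat compressible" case the comment describes, which is why decompression on every read could hurt time-sensitive scoring.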

> Configurable compression for BinaryDocValues
> 
>
> Key: LUCENE-9378
> URL: https://issues.apache.org/jira/browse/LUCENE-9378
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Viral Gandhi
>Priority: Minor
>
> Lucene 8.5.1 includes a change to always [compress 
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This 
> caused a ~30% reduction in our red-line QPS (throughput). 
> We think users should be given some way to opt in to this compression 
> feature instead of it always being enabled, which can have a substantial 
> query-time cost, as we saw during our upgrade. [~mikemccand] suggested one 
> possible approach: introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED 
> and UNCOMPRESSED) and allowing users to create a custom Codec, subclassing the 
> default Codec and picking the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, 
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's a related issue for adding a benchmark covering BINARY doc values 
> query-time performance: [https://github.com/mikemccand/luceneutil/issues/61]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14280) SolrConfig logging not helpful

2020-05-26 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116720#comment-17116720
 ] 

ASF subversion and git services commented on SOLR-14280:


Commit 96e3c25d4432a36f4a49afe17542797c2db690d4 in lucene-solr's branch 
refs/heads/branch_8x from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=96e3c25 ]

SOLR-14280: SolrConfig error handling improvements


> SolrConfig logging not helpful
> --
>
> Key: SOLR-14280
> URL: https://issues.apache.org/jira/browse/SOLR-14280
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Priority: Minor
> Attachments: SOLR-14280-01.patch, SOLR-14280-02.patch, getmessages.txt
>
>
> SolrConfig prints out a warning message if it's not able to add files to the 
> classpath, but this message is not too helpful:
> {noformat}
> o.a.s.c.SolrConfig Couldn't add files from 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/dist filtered 
> by solr-langid-\d.*\.jar to classpath: 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/
> dist {noformat}
> The reason should be at the end of the log message, but it just repeats the 
> problematic file name.
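A minimal sketch of the kind of improvement this fix implies: append the underlying exception (class and message) to the warning instead of repeating the path. The method name and message shape are assumptions, not Solr's actual code:

```java
// Illustrative only: build the "Couldn't add files" warning so that it
// ends with the real cause rather than the directory name again.
public class LogMessageSketch {
    static String couldNotAddFiles(String dir, String filter, Exception cause) {
        // cause.toString() yields "<exception class>: <message>"
        return String.format("Couldn't add files from %s filtered by %s to classpath: %s",
                dir, filter, cause);
    }

    public static void main(String[] args) {
        Exception cause = new java.nio.file.NoSuchFileException("/opt/solr/dist");
        System.out.println(couldNotAddFiles("/opt/solr/dist", "solr-langid-\\d.*\\.jar", cause));
        // The cause's class and message now appear at the end of the line.
    }
}
```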



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14280) SolrConfig logging not helpful

2020-05-26 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-14280:
---
Fix Version/s: 8.6
   master (9.0)
 Assignee: Jason Gerlowski
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks for the improvements Andras.  Committed this to master and branch_8x, so 
we'll see it starting in 8.6.

FYI - no need to give your patches "version" suffixes ({{-02.patch}}, 
{{-01.patch}}, etc.). JIRA already handles this pretty well - attachments with 
the same name are displayed in descending order, and all but the newest version 
are "greyed out" to make it clear they're less recent. (I don't mind the 
suffixes, but some do, so I just wanted to give you a heads up.)

> SolrConfig logging not helpful
> --
>
> Key: SOLR-14280
> URL: https://issues.apache.org/jira/browse/SOLR-14280
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Jason Gerlowski
>Priority: Minor
> Fix For: master (9.0), 8.6
>
> Attachments: SOLR-14280-01.patch, SOLR-14280-02.patch, getmessages.txt
>
>
> SolrConfig prints out a warning message if it's not able to add files to the 
> classpath, but this message is not too helpful:
> {noformat}
> o.a.s.c.SolrConfig Couldn't add files from 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/dist filtered 
> by solr-langid-\d.*\.jar to classpath: 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/
> dist {noformat}
> The reason should be at the end of the log message, but it just repeats the 
> problematic file name.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9383) Port benchmark module Ant build to Gradle

2020-05-26 Thread David Smiley (Jira)
David Smiley created LUCENE-9383:


 Summary: Port benchmark module Ant build to Gradle
 Key: LUCENE-9383
 URL: https://issues.apache.org/jira/browse/LUCENE-9383
 Project: Lucene - Core
  Issue Type: Sub-task
  Components: modules/benchmark
Reporter: David Smiley
Assignee: David Smiley


The benchmark module's build is more than a conventional build: it also has 
scripts to fetch sample data for perf testing, and a task to run a perf test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14467) inconsistent server errors combining relatedness() with allBuckets:true

2020-05-26 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116880#comment-17116880
 ] 

Chris M. Hostetter commented on SOLR-14467:
---


bq. One general question: when SlotContext.isAllBucket()==true, what would be 
returned by SlotContext.getSlotQuery()?

I would suggest that the javadoc for {{getQuery()}} should say that its 
behavior is undefined if {{isAllBucket()}} is true -- from an implementation 
standpoint I think throwing {{IllegalStateException}} is going to be the 
best/fastest way to catch bugs.

bq. FWIW I thought a little more about why I proceeded under the assumption 
that relatedness() would be meaningful for allBuckets. I actually do think it 
could be relevant, but in a way that (upon further reflection) I think can only 
be practically calculated using sweep collection. ...

Let's file a new "Improvement" jira to re-consider this topic down the road, 
cross link back to this issue for the existing context/discussion of semantic 
meaning & implementation ideas, and make sure that the changes we make to 
SlotAcc & SKGAcc for this bug fix have 'TODO' comments linking to the new issue 
as possible improvements.


bq. Oh, and in light of this conversation, I'm guessing we should probably 
force-push/overwrite all of my most recent ...

Well, that's entirely up to you -- it's your repo, and doesn't affect me much 
(both because I don't have any significant unpushed changes, and because I'm 
always very cautious about preserving work in patch files and stashes).  

If you're asking my opinion: I personally think force-push is the single 
greatest failure in the design of git, and if I were in charge at GitHub I would 
completely disable it for all repos and tell people who want to use it to find 
hosting elsewhere ... it's the antithesis of collaborative development (even if 
there's only one person with rights to "push", it can cause problems for anyone 
who has the ability to "fetch" and depends on the repo).

If your motivation is to try and keep the history on the PR branch clean, don't 
worry about it -- I'm not sure how PRs normally get "applied" if you use the 
GitHub UI, but I'll ultimately do it manually, either via squash merge or by 
actually generating a new patch file (as I mentioned before: I'm old school).


> inconsistent server errors combining relatedness() with allBuckets:true
> ---
>
> Key: SOLR-14467
> URL: https://issues.apache.org/jira/browse/SOLR-14467
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14467.patch, SOLR-14467_test.patch
>
>
> While working on randomized testing for SOLR-13132 i discovered a variety of 
> different ways that JSON Faceting's "allBuckets" option can fail when 
> combined with the "relatedness()" function.
> I haven't found a trivial way to manually reproduce this, but I have been able 
> to trigger the failures with a trivial patch to {{TestCloudJSONFacetSKG}}, 
> which I will attach.
> Based on the nature of the failures, it looks like it may have something to do 
> with multiple segments of different sizes, and/or resizing the SlotAccs?
> The relatedness() function doesn't have much (any?) existing test coverage 
> that leverages "allBuckets", so this is probably a bug that has always existed 
> -- it's possible it may be excessively cumbersome to fix, and we might 
> need/want to just document that incompatibility and add some code to try and 
> detect if the user combines these options and, if so, fail with a 400 error?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13289) Support for BlockMax WAND

2020-05-26 Thread Tomas Eduardo Fernandez Lobbe (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116924#comment-17116924
 ] 

Tomas Eduardo Fernandez Lobbe commented on SOLR-13289:
--

I like the idea, Christine, thanks! Let's take it up in a follow-up Jira issue; 
no need to keep adding to this one, I think. I already created a couple of 
follow-up tasks -- maybe also link yours?

> Support for BlockMax WAND
> -
>
> Key: SOLR-13289
> URL: https://issues.apache.org/jira/browse/SOLR-13289
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Fix For: master (9.0), 8.6
>
> Attachments: SOLR-13289.patch, SOLR-13289.patch
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to 
> expose this via Solr. When enabled, the numFound returned will not be exact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9326) Refactor SortField to better handle extensions

2020-05-26 Thread Tony Xu (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116928#comment-17116928
 ] 

Tony Xu commented on LUCENE-9326:
-

Hi Alan,

I landed here because of this PR 
[https://github.com/apache/lucene-solr/pull/1537/files]

It reminds me of an issue [1] that I reported while working on our in-house 
application, which manages sorting itself. This means we only want to read the 
values. However, reading the values of a SortField needs to go through 
FieldComparator, which in most cases maintains a priority queue (allocates 
storage). 

 

Maybe this is not directly relevant to what you're trying to solve, but I'm 
linking it here for awareness.

[1]https://issues.apache.org/jira/browse/LUCENE-8878

> Refactor SortField to better handle extensions
> --
>
> Key: LUCENE-9326
> URL: https://issues.apache.org/jira/browse/LUCENE-9326
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
>
> Working on LUCENE-9325 has made me realize that SortField needs some serious 
> reworking:
> * we have a bunch of hard-coded types, but also a number of custom 
> extensions, which make implementing new sort orders complicated in 
> non-obvious ways
> * we refer to these hard-coded types in a number of places, in particular in 
> index sorts, which means that you can't use a 'custom' sort here.  For 
> example, I can see it would be very useful to be able to index sort by 
> distance from a particular point, but that's not currently possible.
> * the API separates out the comparator and whether or not it should be 
> reversed, which adds an extra layer of complication to its use, particularly 
> in cases where we have multiple sortfields.
> The whole thing could do with an overhaul.  I think this can be broken up 
> into a few stages by adding a new superclass abstraction which `SortField` 
> will extend, and gradually moving functionality into this superclass.  I plan 
> on starting with index sorting, which will require a sort field to a) be able 
> to merge sort documents coming from a list of readers, and b) serialize 
> itself to and deserialize itself from SegmentInfo



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9384) Backport for the field sort optimization of LUCENE-9280

2020-05-26 Thread Mayya Sharipova (Jira)
Mayya Sharipova created LUCENE-9384:
---

 Summary: Backport for the field sort optimization of LUCENE-9280
 Key: LUCENE-9384
 URL: https://issues.apache.org/jira/browse/LUCENE-9384
 Project: Lucene - Core
  Issue Type: Task
Affects Versions: 8.x
Reporter: Mayya Sharipova


Field sort optimization implemented in LUCENE-9280 is based on the assumption 
that if a numeric field is indexed with both doc values and points, the *same 
data* is stored in these points and doc values. While there is a plan in 
LUCENE-9334 to enforce this consistency from Lucene 9.0, there is nothing in 
Lucene 8.x to enforce this assumption.

 

Thus, in order to backport the sort optimization to 8.x, we need to make users 
explicitly opt in to it. This could be done by either:
 * introducing a special SortField (e.g. DocValuesPointSortField) that will use 
the optimization
 * introducing a bool parameter to SortField which, when true, indicates that 
the sort optimization should be enabled (e.g. SortField("my_field", 
SortField.Type.LONG, true))

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13749) Implement support for joining across collections with multiple shards ( XCJF )

2020-05-26 Thread Kevin Watters (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117018#comment-17117018
 ] 

Kevin Watters commented on SOLR-13749:
--

The routerField in solrconfig.xml is only used to choose an intelligent 
default for the "routed" parameter. You can pass "routed=true" as a local param 
in the query request; at that point, the routerField is irrelevant.  
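For reference, a cross-collection join filter with an explicit routed local param might look like the following filter query; the collection and field names here are illustrative, not taken from the ticket:

```
fq={!xcjf collection=remoteProducts from=product_id_s to=product_id_s routed=true}category:electronics
```

With routed=true forced, each shard restricts the remote query to its own hash range regardless of what routerField is configured.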

> Implement support for joining across collections with multiple shards ( XCJF )
> --
>
> Key: SOLR-13749
> URL: https://issues.apache.org/jira/browse/SOLR-13749
> Project: Solr
>  Issue Type: New Feature
>Reporter: Kevin Watters
>Assignee: Gus Heck
>Priority: Blocker
> Fix For: 8.6
>
> Attachments: 2020-03 Smiley with ASF hat.jpeg
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This ticket includes 2 query parsers.
> The first one is the "Cross-Collection Join Filter" (XCJF) query parser. It 
> can do a call out to a remote collection to get a set of join keys to be used 
> as a filter against the local collection.
> The second one is the Hash Range query parser, with which you can specify a 
> field name and a hash range; the result is that only the documents that would 
> have hashed to that range will be returned.
> The XCJF query parser will do an intersection based on join keys between 2 
> collections.
> The local collection is the collection that you are searching against.
> The remote collection is the collection that contains the join keys that you 
> want to use as a filter.
> Each shard participating in the distributed request will execute a query 
> against the remote collection.  If the local collection is setup with the 
> compositeId router to be routed on the join key field, a hash range query is 
> applied to the remote collection query to only match the documents that 
> contain a potential match for the documents that are in the local shard/core. 
>  
>  
> Here's some vocab to help with the descriptions of the various parameters.
> ||Term||Description||
> |Local Collection|This is the main collection that is being queried.|
> |Remote Collection|This is the collection that the XCJFQuery will query to 
> resolve the join keys.|
> |XCJFQuery|The lucene query that executes a search to get back a set of join 
> keys from a remote collection|
> |HashRangeQuery|The lucene query that matches only the documents whose hash 
> code on a field falls within a specified range.|
>  
>  
> ||Param ||Required ||Description||
> |collection|Required|The name of the external Solr collection to be queried 
> to retrieve the set of join key values ( required )|
> |zkHost|Optional|The connection string to be used to connect to Zookeeper.  
> zkHost and solrUrl are both optional parameters, and at most one of them 
> should be specified.  
> If neither of zkHost or solrUrl are specified, the local Zookeeper cluster 
> will be used. ( optional )|
> |solrUrl|Optional|The URL of the external Solr node to be queried ( optional 
> )|
> |from|Required|The join key field name in the external collection ( required 
> )|
> |to|Required|The join key field name in the local collection|
> |v|See Note|The query to be executed against the external Solr collection to 
> retrieve the set of join key values.  
> Note:  The original query can be passed at the end of the string or as the 
> "v" parameter.  
> It's recommended to use query parameter substitution with the "v" parameter 
> to ensure no issues arise with the default query parsers.|
> |routed| |true / false.  If true, the XCJF query will use each shard's hash 
> range to determine the set of join keys to retrieve for that shard.
> This parameter improves the performance of the cross-collection join, but 
> it depends on the local collection being routed by the toField.  If this 
> parameter is not specified, 
> the XCJF query will try to determine the correct value automatically.|
> |ttl| |The length of time that an XCJF query in the cache will be considered 
> valid, in seconds.  Defaults to 3600 (one hour).  
> The XCJF query will not be aware of changes to the remote collection, so 
> if the remote collection is updated, cached XCJF queries may give inaccurate 
> results.  
> After the ttl period has expired, the XCJF query will re-execute the join 
> against the remote collection.|
> |_All others_| |Any normal Solr parameter can also be specified as a local 
> param.|
>  
> Example Solr Config.xml changes:
>  
> {noformat}
> <cache name="hash_vin"
>        class="solr.LRUCache"
>        size="128"
>        initialSize="0"
>        regenerator="solr.NoOpRegenerator"/>
> {noformat}

[jira] [Commented] (LUCENE-9382) Lucene's gradle version can't cope with Java 14

2020-05-26 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117022#comment-17117022
 ] 

Dawid Weiss commented on LUCENE-9382:
-

Upgrading to a newer Gradle isn't always trivial (in fact, it almost never 
is...). 

> Lucene's gradle version can't cope with Java 14
> ---
>
> Key: LUCENE-9382
> URL: https://issues.apache.org/jira/browse/LUCENE-9382
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Priority: Major
>
> If you have JDK 14 installed as your default java, then attempting to use 
> gradle within the lucene-solr project can result in errors, particularly if 
> you have other projects that use more recent gradle versions on the same 
> machine.
> ```
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.codehaus.groovy.vmplugin.v7.Java7
> at 
> org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43)
> at 
> org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35)
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-9382) Lucene's gradle version can't cope with Java 14

2020-05-26 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss reassigned LUCENE-9382:
---

Assignee: Dawid Weiss

> Lucene's gradle version can't cope with Java 14
> ---
>
> Key: LUCENE-9382
> URL: https://issues.apache.org/jira/browse/LUCENE-9382
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Dawid Weiss
>Priority: Major
>
> If you have JDK 14 installed as your default java, then attempting to use 
> gradle within the lucene-solr project can result in errors, particularly if 
> you have other projects that use more recent gradle versions on the same 
> machine.
> ```
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.codehaus.groovy.vmplugin.v7.Java7
> at 
> org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43)
> at 
> org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35)
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9382) Lucene's gradle version can't cope with Java 14

2020-05-26 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-9382:

Parent: LUCENE-9077
Issue Type: Sub-task  (was: Improvement)

> Lucene's gradle version can't cope with Java 14
> ---
>
> Key: LUCENE-9382
> URL: https://issues.apache.org/jira/browse/LUCENE-9382
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Alan Woodward
>Assignee: Dawid Weiss
>Priority: Major
>
> If you have JDK 14 installed as your default java, then attempting to use 
> gradle within the lucene-solr project can result in errors, particularly if 
> you have other projects that use more recent gradle versions on the same 
> machine.
> ```
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.codehaus.groovy.vmplugin.v7.Java7
> at 
> org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43)
> at 
> org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35)
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-11334) UnifiedSolrHighlighter returns an error when hl.fl delimited by ", "

2020-05-26 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117038#comment-17117038
 ] 

David Smiley commented on SOLR-11334:
-

Thanks but the patch doesn't fix the root cause; it works around it.  The root 
cause is clearly {{SolrPluginUtils.split}} for producing such empty strings in 
the first place.  I filed a PR accomplishing the fix.  I like your test!
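A minimal sketch of the root-cause fix described above: a field-list splitter that never yields zero-length tokens, so "name, manu" parses to two field names instead of three. This mirrors the intent of the fix, not Solr's exact {{SolrPluginUtils.split}} code:

```java
import java.util.regex.Pattern;

// Sketch: split a comma- and/or whitespace-delimited field list without
// producing empty tokens that would later be treated as field names.
public class FieldListSplitSketch {
    // One or more commas/whitespace chars count as a single delimiter.
    private static final Pattern SPLITTER = Pattern.compile("[,\\s]+");

    static String[] split(String fieldList) {
        // Trim first so leading/trailing delimiters can't produce empty tokens.
        String trimmed = fieldList.trim();
        return trimmed.isEmpty() ? new String[0] : SPLITTER.split(trimmed);
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(split("name, manu"))); // [name, manu]
        System.out.println(split("name, manu").length);                     // 2
    }
}
```

With the buggy behavior, splitting on "," then on whitespace left an empty string between the comma and the space, which the highlighter then looked up as an "undefined field".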

> UnifiedSolrHighlighter returns an error when hl.fl delimited by ", "
> 
>
> Key: SOLR-11334
> URL: https://issues.apache.org/jira/browse/SOLR-11334
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 6.6
> Environment: Ubuntu 17.04 (GNU/Linux 4.10.0-33-generic x86_64)
> Java HotSpot 64-Bit Server VM(build 25.114-b01, mixed mode)
>Reporter: Yasufumi Mizoguchi
>Priority: Trivial
> Attachments: SOLR-11334.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> UnifiedSolrHighlighter(hl.method=unified) misjudge the zero-length string as 
> a field name and returns an error when hl.fl delimited by ", "
> request:
> {code}
> $ curl -XGET 
> "http://localhost:8983/solr/techproducts/select?fl=name,%20manu&hl.fl=name,%20manu&hl.method=unified&hl=on&indent=on&q=corsair&wt=json";
> {code}
> response:
> {code}
> {
>   "responseHeader":{
> "status":400,
> "QTime":8,
> "params":{
>   "q":"corsair",
>   "hl":"on",
>   "indent":"on",
>   "fl":"name, manu",
>   "hl.fl":"name, manu",
>   "hl.method":"unified",
>   "wt":"json"}},
>   "response":{"numFound":2,"start":0,"docs":[
>   {
> "name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 
> (PC 3200) System Memory - Retail",
> "manu":"Corsair Microsystems Inc."},
>   {
> "name":"CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 
> 400 (PC 3200) Dual Channel Kit System Memory - Retail",
> "manu":"Corsair Microsystems Inc."}]
>   },
>   "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","org.apache.solr.common.SolrException"],
> "msg":"undefined field ",
> "code":400}}
> {code}
> DefaultHighlighter's response:
> {code}
> {
>   "responseHeader":{
> "status":0,
> "QTime":5,
> "params":{
>   "q":"corsair",
>   "hl":"on",
>   "indent":"on",
>   "fl":"name, manu",
>   "hl.fl":"name, manu",
>   "hl.method":"original",
>   "wt":"json"}},
>   "response":{"numFound":2,"start":0,"docs":[
>   {
> "name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 
> (PC 3200) System Memory - Retail",
> "manu":"Corsair Microsystems Inc."},
>   {
> "name":"CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 
> 400 (PC 3200) Dual Channel Kit System Memory - Retail",
> "manu":"Corsair Microsystems Inc."}]
>   },
>   "highlighting":{
> "VS1GB400C3":{
>   "name":["CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered 
> DDR 400 (PC 3200) System Memory - Retail"],
>   "manu":["Corsair Microsystems Inc."]},
> "TWINX2048-3200PRO":{
>   "name":["CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM 
> Unbuffered DDR 400 (PC 3200) Dual Channel Kit System"],
>   "manu":["Corsair Microsystems Inc."]}}}
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9077) Gradle build

2020-05-26 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117042#comment-17117042
 ] 

David Smiley commented on LUCENE-9077:
--

I noticed that the Gradle build cache is defeated when depending on our own 
JARs, because we include a timestamp in the MANIFEST.MF 
"Implementation-Version".  Consequently we over-build JARs.  I see this as I 
work on migrating the benchmark module from Ant to Gradle (WIP).  Gradle 
upstream seems to have a fix in progress: 
https://github.com/gradle/gradle/pull/11937  I've been unable to use the 
"normalization" mechanism to simply ignore the manifest -- 
https://docs.gradle.org/6.0/userguide/more_about_tasks.html#sec:configure_input_normalization
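For context, the documented Gradle mechanism referenced above is runtime classpath normalization, which excludes named files from build-cache input hashing. This is a sketch of what was attempted; note the comment reports it did not work in their setup, so treat it as a starting point rather than a known fix:

```groovy
// build.gradle: ignore the manifest (which carries the timestamped
// Implementation-Version) when hashing dependency JARs for the build cache.
normalization {
    runtimeClasspath {
        ignore("META-INF/MANIFEST.MF")
    }
}
```

The upstream PR linked above takes the finer-grained approach of ignoring individual manifest attributes instead of the whole file.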

> Gradle build
> 
>
> Key: LUCENE-9077
> URL: https://issues.apache.org/jira/browse/LUCENE-9077
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9077-javadoc-locale-en-US.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> This task focuses on providing gradle-based build equivalent for Lucene and 
> Solr (on master branch). See notes below on why this respin is needed.
> The code lives on *gradle-master* branch. It is kept with sync with *master*. 
> Try running the following to see an overview of helper guides concerning 
> typical workflow, testing and ant-migration helpers:
> gradlew :help
> A list of items that needs to be added or requires work. If you'd like to 
> work on any of these, please add your name to the list. Once you have a 
> patch/ pull request let me (dweiss) know - I'll try to coordinate the merges.
>  * (/) Apply forbiddenAPIs
>  * (/) Generate hardware-aware gradle defaults for parallelism (count of 
> workers and test JVMs).
>  * (/) Fail the build if --tests filter is applied and no tests execute 
> during the entire build (this allows for an empty set of filtered tests at 
> single project level).
>  * (/) Port other settings and randomizations from common-build.xml
>  * (/) Configure security policy/ sandboxing for tests.
>  * (/) test's console output on -Ptests.verbose=true
>  * (/) add a :helpDeps explanation to how the dependency system works 
> (palantir plugin, lockfile) and how to retrieve structured information about 
> current dependencies of a given module (in a tree-like output).
>  * (/) jar checksums, jar checksum computation and validation. This should be 
> done without intermediate folders (directly on dependency sets).
>  * (/) verify min. JVM version and exact gradle version on build startup to 
> minimize odd build side-effects
>  * (/) Repro-line for failed tests/ runs.
>  * (/) add a top-level README note about building with gradle (and the 
> required JVM).
>  * (/) add an equivalent of 'validate-source-patterns' 
> (check-source-patterns.groovy) to precommit.
>  * (/) add an equivalent of 'rat-sources' to precommit.
>  * (/) add an equivalent of 'check-example-lucene-match-version' (solr only) 
> to precommit.
>  * (/) javadoc compilation
> Hard-to-implement stuff already investigated:
>  * (/) (done)  -*Printing console output of failed tests.* There doesn't seem 
> to be any way to do this in a reasonably efficient way. There are onOutput 
> listeners but they're slow to operate and solr tests emit *tons* of output so 
> it's an overkill.-
>  * (!) (LUCENE-9120) *Tests working with security-debug logs or other 
> JVM-early log output*. Gradle's test runner works by redirecting Java's 
> stdout/ syserr so this just won't work. Perhaps we can spin the ant-based 
> test runner for such corner-cases.
> Of lesser importance:
>  * Add an equivalent of 'documentation-lint" to precommit.
>  * (/) Do not require files to be committed before running precommit. (staged 
> files are fine).
>  * (/) add rendering of javadocs (gradlew javadoc)
>  * Attach javadocs to maven publications.
>  * Add test 'beasting' (rerunning the same suite multiple times). I'm afraid 
> it'll be difficult to run it sensibly because gradle doesn't offer cwd 
> separation for the forked test runners.
>  * if you diff solr packaged distribution against ant-created distribution 
> there are minor differences in library versions and some JARs are excluded/ 
> moved around. I didn't try to force these as everything seems to work (tests, 
etc.) – perhaps these differences should be fixed in the ant build instead.
>  * (/) identify and port various "regenerate" tasks from ant builds (javacc, 
> precompiled automata, etc.)
>  * Fill in POM details in gradle/defaults-maven.gradle so that they reflect 
> the previous content better (dependencies aside).
>  * Add any IDE integration layers that should be added (I use IntelliJ and it 
> imports th

[jira] [Created] (SOLR-14514) json.facets: method:stream is incompatible with allBuckets:true

2020-05-26 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-14514:
-

 Summary: json.facets: method:stream is incompatible with 
allBuckets:true
 Key: SOLR-14514
 URL: https://issues.apache.org/jira/browse/SOLR-14514
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Facet Module
Reporter: Chris M. Hostetter


{{FacetFieldProcessorByEnumTermsStream}} has never supported 
{{allBuckets:true}} but it also doesn't fail outright if {{allBuckets:true}} is 
specified -- instead the bucket is silently missing from the response.

Given how the {{method}} option is only used as a suggestion, and the actual 
processor can change based on heuristics about the request, the behavior of 
combining {{allBuckets:true}} with {{method:stream}} can vary depending on the 
other options specified -- notably the {{sort}}.

{noformat}
% curl -sS -X POST http://localhost:8983/solr/techproducts/query -d 
'omitHeader=true&rows=0&q=*:*&json.facet={
  x : {
type : terms,
method: stream,
field : manu_id_s,
allBuckets : true,
limit : 2,
} }'

{
  "response":{"numFound":32,"start":0,"docs":[]
  },
  "facets":{
"count":32,
"x":{
  "allBuckets":{
"count":18},
  "buckets":[{
  "val":"corsair",
  "count":3},
{
  "val":"belkin",
  "count":2}]}}}


% curl -sS -X POST http://localhost:8983/solr/techproducts/query -d 
'omitHeader=true&rows=0&q=*:*&json.facet={
  x : {
type : terms,
method: stream,
field : manu_id_s,
allBuckets : true,
limit : 2,
sort: "index asc"
} }'

{
  "response":{"numFound":32,"start":0,"docs":[]
  },
  "facets":{
"count":32,
"x":{
  "buckets":[{
  "val":"apple",
  "count":1},
{
  "val":"asus",
  "count":1}]}}}
{noformat}
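The two requests above show the inconsistency: the stream processor silently drops the {{allBuckets}} bucket rather than failing. Until the server rejects the combination, one workaround is to validate requests client-side. The sketch below is a hypothetical guard, not part of Solr, assuming the {{json.facet}} terms facet is modeled as a plain map of its options:

```java
import java.util.Map;

public class FacetRequestGuard {
    // Hypothetical client-side guard (not Solr code): reject up front the
    // combination that FacetFieldProcessorByEnumTermsStream silently ignores.
    static void validate(Map<String, Object> termsFacet) {
        boolean allBuckets = Boolean.TRUE.equals(termsFacet.get("allBuckets"));
        boolean stream = "stream".equals(termsFacet.get("method"));
        if (allBuckets && stream) {
            throw new IllegalArgumentException(
                "allBuckets:true is not supported with method:stream");
        }
    }

    public static void main(String[] args) {
        // A non-stream method with allBuckets passes the check.
        validate(Map.of("type", "terms", "method", "dv", "allBuckets", true));
        try {
            validate(Map.of("type", "terms", "method", "stream", "allBuckets", true));
            throw new AssertionError("expected rejection");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

Note this only catches the explicit {{method:stream}} case; since {{method}} is merely a hint, the server can still pick the stream processor on its own (e.g. with {{sort: "index asc"}}), which a client-side check cannot see.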




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14467) inconsistent server errors combining relatedness() with allBuckets:true

2020-05-26 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14467:
--
Attachment: SOLR-14467_test.patch
Status: Open  (was: Open)

I've updated {{SOLR-14467_test.patch}} to now (really) test what happens if 
{{allBuckets:true}} is specified, and assert the behavior we've discussed 
(although some tweaks may be needed depending on the final output chosen).  

(The patch also includes test logic to skip trying to use allBuckets when 
STREAM parser may be used - see SOLR-14514)

[~mgibney] - did you (still) want to take a stab at the new solution we 
discussed to fix this or should i start working on it?

> inconsistent server errors combining relatedness() with allBuckets:true
> ---
>
> Key: SOLR-14467
> URL: https://issues.apache.org/jira/browse/SOLR-14467
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14467.patch, SOLR-14467_test.patch, 
> SOLR-14467_test.patch
>
>
> While working on randomized testing for SOLR-13132 i discovered a variety of 
> different ways that JSON Faceting's "allBuckets" option can fail when 
> combined with the "relatedness()" function.
> I haven't found a trivial way to manually reproduce this, but I have been able 
> to trigger the failures with a trivial patch to {{TestCloudJSONFacetSKG}}, 
> which I will attach.
> Based on the nature of the failures it looks like it may have something to do 
> with multiple segments of different sizes, and/or resizing the SlotAccs?
> The relatedness() function doesn't have many (any?) existing tests in place 
> that leverage "allBuckets", so this is probably a bug that has always existed 
> -- it's possible it may be excessively cumbersome to fix, and we might 
> need/want to just document that incompatibility and add some code to try and 
> detect if the user combines these options and, if so, fail with a 400 error?






[jira] [Created] (SOLR-14516) NPE during Realtime GET

2020-05-26 Thread Noble Paul (Jira)
Noble Paul created SOLR-14516:
-

 Summary: NPE during Realtime GET
 Key: SOLR-14516
 URL: https://issues.apache.org/jira/browse/SOLR-14516
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul


 
o.a.s.s.HttpSolrCall null:java.lang.NullPointerException\n\tat 
org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:83)\n\tat
 org.apache.solr.schema.StrField.write(StrField.java:101)\n\tat 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:124)\n\tat
 
org.apache.solr.response.JSONWriter.writeSolrDocument(JSONWriter.java:106)\n\tat
 
org.apache.solr.response.TextResponseWriter.writeSolrDocumentList(TextResponseWriter.java:170)\n\tat
 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:147)\n\tat
 
org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)\n\tat
 
org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)\n\tat
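The trace ends in {{JsonTextWriter.writeStr}}, which suggests (this is an assumption; the ticket does not state the root cause) that a null string value reached the writer during realtime GET. A minimal, self-contained sketch of the defensive pattern, using a toy writer rather than the real Solr classes:

```java
// Toy stand-in for a JSON field writer (NOT the actual JsonTextWriter):
// illustrates emitting JSON null instead of dereferencing a null value,
// which is the kind of guard that would avoid an NPE at the writeStr step.
public class SafeJsonWriter {
    static String writeStr(String name, String val) {
        if (val == null) {
            return "\"" + name + "\":null";  // guard: no method call on val
        }
        // Real writers also escape quotes/control chars; omitted for brevity.
        return "\"" + name + "\":\"" + val + "\"";
    }

    public static void main(String[] args) {
        System.out.println(writeStr("id", null));   // prints "id":null
        System.out.println(writeStr("id", "doc1")); // prints "id":"doc1"
    }
}
```

Whether the proper fix belongs in the writer or upstream (preventing a null from ever reaching {{StrField.write}}) depends on the actual root cause, which the issue leaves open.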






[jira] [Assigned] (SOLR-14516) NPE during Realtime GET

2020-05-26 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-14516:
-

Assignee: Noble Paul

> NPE during Realtime GET
> ---
>
> Key: SOLR-14516
> URL: https://issues.apache.org/jira/browse/SOLR-14516
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
>  
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException\n\tat 
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:83)\n\tat
>  org.apache.solr.schema.StrField.write(StrField.java:101)\n\tat 
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:124)\n\tat
>  
> org.apache.solr.response.JSONWriter.writeSolrDocument(JSONWriter.java:106)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeSolrDocumentList(TextResponseWriter.java:170)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:147)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)\n\tat






[jira] [Updated] (SOLR-14516) NPE during Realtime GET

2020-05-26 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14516:
--
Fix Version/s: 8.6

> NPE during Realtime GET
> ---
>
> Key: SOLR-14516
> URL: https://issues.apache.org/jira/browse/SOLR-14516
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.6
>
>
>  
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException\n\tat 
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:83)\n\tat
>  org.apache.solr.schema.StrField.write(StrField.java:101)\n\tat 
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:124)\n\tat
>  
> org.apache.solr.response.JSONWriter.writeSolrDocument(JSONWriter.java:106)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeSolrDocumentList(TextResponseWriter.java:170)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:147)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)\n\tat






[jira] [Updated] (SOLR-14516) NPE during Realtime GET

2020-05-26 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14516:
--
Issue Type: Bug  (was: Task)

> NPE during Realtime GET
> ---
>
> Key: SOLR-14516
> URL: https://issues.apache.org/jira/browse/SOLR-14516
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.6
>
>
>  
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException\n\tat 
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:83)\n\tat
>  org.apache.solr.schema.StrField.write(StrField.java:101)\n\tat 
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:124)\n\tat
>  
> org.apache.solr.response.JSONWriter.writeSolrDocument(JSONWriter.java:106)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeSolrDocumentList(TextResponseWriter.java:170)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:147)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)\n\tat






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1528: SOLR-12823: remove /clusterstate.json

2020-05-26 Thread GitBox


murblanc commented on a change in pull request #1528:
URL: https://github.com/apache/lucene-solr/pull/1528#discussion_r430012046



##
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##
@@ -491,6 +494,41 @@ public boolean isClosed() {
 assert ObjectReleaseTracker.track(this);
   }
 
+  /**
+   * Verifies if /clusterstate.json exists in Zookeeper, and if it does 
and is not empty, refuses to start and outputs
+   * a helpful message regarding collection migration.
+   *
+   * If /clusterstate.json exists and is empty, it is removed.
+   */
+  private void checkNoOldClusterstate(final SolrZkClient zkClient) throws 
InterruptedException {
+try {
+  if (!zkClient.exists(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, true)) {
+return;
+  }
+
+  final byte[] data = 
zkClient.getData(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, null, null, true);
+
+  if (Arrays.equals("{}".getBytes(StandardCharsets.UTF_8), data)) {
+// Empty json. This log will only occur once.
+log.warn("{} no longer supported starting with Solr 9. Found empty 
file on Zookeeper, deleting it.", ZkStateReader.UNSUPPORTED_CLUSTER_STATE);
+zkClient.delete(ZkStateReader.UNSUPPORTED_CLUSTER_STATE, -1, true);

Review comment:
   Looking at pre PR master branch.
   
   The watcher on /clusterstate.json is an instance of 
LegacyClusterStateWatcher (subclass of ZkStateReader).
   The watcher processing is done in refreshAndWatch(), which calls 
ZkStateReader.refreshLegacyClusterState() and does some exception handling.
   
   Even though refreshAndWatch() handles KeeperException.NoNodeException by 
throwing a SolrException SERVICE_UNAVAILABLE, this never happens: 
refreshLegacyClusterState() catches that exception, a comment says "Ignore 
missing legacy clusterstate.json." and the catch builds what would be an empty 
clusterstate.
   
   We should be fine.
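For reference, the empty-payload check in the diff above compares the znode payload byte-for-byte against the UTF-8 encoding of `{}`. A self-contained illustration of the same idiom (not the actual Solr code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EmptyJsonCheck {
    // Same idiom as checkNoOldClusterstate above: treat a znode whose payload
    // is exactly "{}" (UTF-8) as an empty legacy clusterstate.
    static boolean isEmptyJson(byte[] data) {
        return Arrays.equals("{}".getBytes(StandardCharsets.UTF_8), data);
    }

    public static void main(String[] args) {
        System.out.println(isEmptyJson("{}".getBytes(StandardCharsets.UTF_8)));        // prints true
        System.out.println(isEmptyJson("{\"c\":{}}".getBytes(StandardCharsets.UTF_8))); // prints false
    }
}
```

Byte-level equality is strict: a payload of `{ }` or `{}` followed by whitespace would not match, and with the logic in the PR would fall through to the refuse-to-start path rather than being deleted.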

##
File path: solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java
##
@@ -210,47 +200,42 @@ public boolean liveNodesContain(String name) {
   @Override
   public String toString() {
 StringBuilder sb = new StringBuilder();
-sb.append("znodeVersion: ").append(znodeVersion);
-sb.append("\n");
 sb.append("live nodes:").append(liveNodes);
 sb.append("\n");
 sb.append("collections:").append(collectionStates);
 return sb.toString();
   }
 
-  public static ClusterState load(Integer version, byte[] bytes, Set 
liveNodes) {
-return load(version, bytes, liveNodes, ZkStateReader.CLUSTER_STATE);
-  }
   /**
-   * Create ClusterState from json string that is typically stored in 
zookeeper.
+   * Create a ClusterState from Json.
* 
-   * @param version zk version of the clusterstate.json file (bytes)
-   * @param bytes clusterstate.json as a byte array
+   * @param bytes a byte array of a Json representation of a mapping from 
collection name to the Json representation of a
+   *  {@link DocCollection} as written by {@link 
#write(JSONWriter)}. It can represent
+   *  one or more collections.
* @param liveNodes list of live nodes
* @return the ClusterState
*/
-  public static ClusterState load(Integer version, byte[] bytes, Set 
liveNodes, String znode) {
-// System.out.println(" ClusterState.load:" + (bytes==null ? null 
: new String(bytes)));
+  public static ClusterState createFromJson(int version, byte[] bytes, 
Set liveNodes) {
 if (bytes == null || bytes.length == 0) {
-  return new ClusterState(version, liveNodes, Collections.emptyMap());
+  return new ClusterState(liveNodes, Collections.emptyMap());
 }
 Map stateMap = (Map) Utils.fromJSON(bytes);
-return load(version, stateMap, liveNodes, znode);
+return createFromData(version, stateMap, liveNodes);
   }
 
-  public static ClusterState load(Integer version, Map 
stateMap, Set liveNodes, String znode) {
+  public static ClusterState createFromData(int version, Map 
stateMap, Set liveNodes) {

Review comment:
   `createFromCollectionMap`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1537: LUCENE-9381: Add SortOrder interface

2020-05-26 Thread GitBox


romseygeek commented on a change in pull request #1537:
URL: https://github.com/apache/lucene-solr/pull/1537#discussion_r430424273



##
File path: 
lucene/backward-codecs/src/test/org/apache/lucene/codecs/lucene70/Lucene70RWSegmentInfoFormat.java
##
@@ -89,7 +89,10 @@ public void write(Directory dir, SegmentInfo si, IOContext 
ioContext) throws IOE
   int numSortFields = indexSort == null ? 0 : indexSort.getSort().length;
   output.writeVInt(numSortFields);
   for (int i = 0; i < numSortFields; ++i) {
-SortField sortField = indexSort.getSort()[i];
+if (indexSort.getSort()[i] instanceof SortField == false) {

Review comment:
   This is test code for a deprecated segment info format, so users should 
never hit this in production - might be better as an assertion?

##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java
##
@@ -463,16 +465,65 @@ public IndexWriterConfig setCommitOnClose(boolean 
commitOnClose) {
* Set the {@link Sort} order to use for all (flushed and merged) segments.
*/
   public IndexWriterConfig setIndexSort(Sort sort) {
-for (SortField sortField : sort.getSort()) {
+for (SortOrder sortField : sort.getSort()) {
   if (sortField.getIndexSorter() == null) {
 throw new IllegalArgumentException("Cannot sort index with sort field 
" + sortField);
   }
 }
 this.indexSort = sort;
-this.indexSortFields = 
Arrays.stream(sort.getSort()).map(SortField::getField).collect(Collectors.toSet());
+this.indexSortFields = extractFields(sort);
 return this;
   }
 
+  private Set extractFields(Sort sort) {
+Set fields = new HashSet<>();
+for (SortOrder sortOrder : sort.getSort()) {
+  IndexSorter sorter = sortOrder.getIndexSorter();
+  assert sorter != null;
+  try {
+sorter.getDocComparator(new DocValuesLeafReader() {

Review comment:
   Right, because some sorts refer to multiple fields, or to none.  For 
example, if we want to add an IndexSorter to expression sorts, we need to check 
that none of the DV fields referred to in the expression are updateable.

##
File path: lucene/core/src/java/org/apache/lucene/search/Sort.java
##
@@ -22,80 +22,7 @@
 
 
 /**
- * Encapsulates sort criteria for returned hits.
- *
- * The fields used to determine sort order must be carefully chosen.

Review comment:
   ++

##
File path: lucene/core/src/java/org/apache/lucene/search/SortOrder.java
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.io.IOException;
+
+import org.apache.lucene.index.IndexSorter;
+import org.apache.lucene.index.SortFieldProvider;
+
+/**
+ * Defines an ordering for documents within an index
+ */
+public interface SortOrder {
+
+  /**
+   * Returns whether the sort should be reversed.
+   */
+  boolean getReverse();
+
+  /**
+   * Returns the {@link FieldComparator} to use for sorting.
+   *
+   * @param numHits   number of top hits the queue will store
+   * @param sortPos   position of this SortField within {@link Sort}.  The 
comparator is primary
+   *  if sortPos==0, secondary if sortPos==1, etc.  Some 
comparators can
+   *  optimize themselves when they are the primary sort.
+   */
+  FieldComparator getComparator(int numHits, int sortPos);
+
+  /**
+   * Whether the relevance score is needed to sort documents.
+   */
+  boolean needsScores();
+
+  /**
+   * A name for the sort order
+   */
+  default String name() {
+return toString();
+  }
+
+  /**
+   * Rewrites this SortOrder, returning a new SortOrder if a change is made.
+   *
+   * @param searcher IndexSearcher to use during rewriting
+   * @return New rewritten SortOrder, or {@code this} if nothing has changed.
+   */
+  default SortOrder rewrite(IndexSearcher searcher) throws IOException {
+return this;
+  }
+
+  /**
+   * Returns an {@link IndexSorter} used for sorting index segments by this 
SortField.
+   *
+   * If the SortField cannot be used for index sorting (for example, if it 
uses scores or
+   * other query-dependent values) then this method should return

[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents

2020-05-26 Thread GitBox


mayya-sharipova commented on a change in pull request #1351:
URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r430677209



##
File path: 
lucene/core/src/test/org/apache/lucene/search/TestSortOptimization.java
##
@@ -0,0 +1,294 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.document.FloatDocValuesField;
+import org.apache.lucene.document.LongPoint;
+import org.apache.lucene.document.IntPoint;
+import org.apache.lucene.document.FloatPoint;
+import org.apache.lucene.document.NumericDocValuesField;
+import org.apache.lucene.index.DirectoryReader;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.IndexWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.LuceneTestCase;
+
+import java.io.IOException;
+
+import static org.apache.lucene.search.SortField.FIELD_SCORE;
+
+public class TestSortOptimization extends LuceneTestCase {

Review comment:
   @jpountz Thank you for the feedback,  all last comments are addressed in 
1ab2a6e.  
   I have also created a 
[LUCENE-9384](https://issues.apache.org/jira/browse/LUCENE-9384)  for 
backporting to 8.x.








[jira] [Updated] (SOLR-14516) NPE during Realtime GET

2020-05-26 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14516:
--
Description: 
The exact reason is unknown, but the following is the stacktrace:
 
 o.a.s.s.HttpSolrCall null:java.lang.NullPointerException\n\tat 
org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:83)\n\tat
 org.apache.solr.schema.StrField.write(StrField.java:101)\n\tat 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:124)\n\tat
 
org.apache.solr.response.JSONWriter.writeSolrDocument(JSONWriter.java:106)\n\tat
 
org.apache.solr.response.TextResponseWriter.writeSolrDocumentList(TextResponseWriter.java:170)\n\tat
 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:147)\n\tat
 
org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)\n\tat
 
org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)\n\tat

  was:
 
o.a.s.s.HttpSolrCall null:java.lang.NullPointerException\n\tat 
org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:83)\n\tat
 org.apache.solr.schema.StrField.write(StrField.java:101)\n\tat 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:124)\n\tat
 
org.apache.solr.response.JSONWriter.writeSolrDocument(JSONWriter.java:106)\n\tat
 
org.apache.solr.response.TextResponseWriter.writeSolrDocumentList(TextResponseWriter.java:170)\n\tat
 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:147)\n\tat
 
org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)\n\tat
 
org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)\n\tat


> NPE during Realtime GET
> ---
>
> Key: SOLR-14516
> URL: https://issues.apache.org/jira/browse/SOLR-14516
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.6
>
>
> The exact reason is unknown, but the following is the stacktrace:
>  
>  o.a.s.s.HttpSolrCall null:java.lang.NullPointerException\n\tat 
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:83)\n\tat
>  org.apache.solr.schema.StrField.write(StrField.java:101)\n\tat 
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:124)\n\tat
>  
> org.apache.solr.response.JSONWriter.writeSolrDocument(JSONWriter.java:106)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeSolrDocumentList(TextResponseWriter.java:170)\n\tat
>  
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:147)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386)\n\tat
>  
> org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292)\n\tat






[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1537: LUCENE-9381: Add SortOrder interface

2020-05-26 Thread GitBox


mikemccand commented on a change in pull request #1537:
URL: https://github.com/apache/lucene-solr/pull/1537#discussion_r430367067



##
File path: lucene/core/src/java/org/apache/lucene/search/Sort.java
##
@@ -22,80 +22,7 @@
 
 
 /**
- * Encapsulates sort criteria for returned hits.
- *
- * The fields used to determine sort order must be carefully chosen.

Review comment:
   Wow, these docs were quite stale!  They hark back to the `FieldCache` days. 
 Wow, 2004!
   
   I think instead of fully deleting them, we should just update them to 
state that you index the corresponding typed doc values field, and then sort by 
that type?

##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java
##
@@ -463,16 +465,65 @@ public IndexWriterConfig setCommitOnClose(boolean 
commitOnClose) {
* Set the {@link Sort} order to use for all (flushed and merged) segments.
*/
   public IndexWriterConfig setIndexSort(Sort sort) {
-for (SortField sortField : sort.getSort()) {
+for (SortOrder sortField : sort.getSort()) {
   if (sortField.getIndexSorter() == null) {
 throw new IllegalArgumentException("Cannot sort index with sort field 
" + sortField);
   }
 }
 this.indexSort = sort;
-this.indexSortFields = 
Arrays.stream(sort.getSort()).map(SortField::getField).collect(Collectors.toSet());
+this.indexSortFields = extractFields(sort);
 return this;
   }
 
+  private Set extractFields(Sort sort) {
+Set fields = new HashSet<>();
+for (SortOrder sortOrder : sort.getSort()) {
+  IndexSorter sorter = sortOrder.getIndexSorter();
+  assert sorter != null;
+  try {
+sorter.getDocComparator(new DocValuesLeafReader() {

Review comment:
   Hmm this is a little bit scary hackity, compared to what we had before :)
   
   Are you wanting to not add a `SortOrder.getField()`?

##
File path: 
lucene/backward-codecs/src/test/org/apache/lucene/codecs/lucene70/Lucene70RWSegmentInfoFormat.java
##
@@ -89,7 +89,10 @@ public void write(Directory dir, SegmentInfo si, IOContext 
ioContext) throws IOE
   int numSortFields = indexSort == null ? 0 : indexSort.getSort().length;
   output.writeVInt(numSortFields);
   for (int i = 0; i < numSortFields; ++i) {
-SortField sortField = indexSort.getSort()[i];
+if (indexSort.getSort()[i] instanceof SortField == false) {

Review comment:
   Hmm, maybe `indexSort` should just continue to take `SortField` for now? 
 Else this is sort of trappy -- only when `IndexWriter` goes to flush a segment 
to disk, will we (late) hit this exception?
   
   Or, maybe we could do this check instead in `IndexWriterConfig`?

##
File path: lucene/core/src/java/org/apache/lucene/search/SortField.java
##
@@ -172,8 +180,9 @@ public SortField readSortField(DataInput in) throws 
IOException {
 }
 
 @Override
-public void writeSortField(SortField sf, DataOutput out) throws 
IOException {
-  sf.serialize(out);
+public void writeSortField(SortOrder sf, DataOutput out) throws 
IOException {
+  assert sf instanceof SortField;

Review comment:
   Shouldn't this be a real check?  Caller could legitimately mess this up?

##
File path: lucene/core/src/java/org/apache/lucene/search/SortOrder.java
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.io.IOException;
+
+import org.apache.lucene.index.IndexSorter;
+import org.apache.lucene.index.SortFieldProvider;
+
+/**
+ * Defines an ordering for documents within an index
+ */
+public interface SortOrder {
+
+  /**
+   * Returns whether the sort should be reversed.
+   */
+  boolean getReverse();
+
+  /**
+   * Returns the {@link FieldComparator} to use for sorting.
+   *
+   * @param numHits   number of top hits the queue will store
+   * @param sortPos   position of this SortField within {@link Sort}.  The 
comparator is primary
+   *  if sortPos==0, secondary if sortPos==1, etc.  Some 
comparators can
+   *  optimize themselves when they are the primary sort.
+   */
+  FieldComparator getComparator

[GitHub] [lucene-solr] romseygeek commented on pull request #1537: LUCENE-9381: Add SortOrder interface

2020-05-26 Thread GitBox


romseygeek commented on pull request #1537:
URL: https://github.com/apache/lucene-solr/pull/1537#issuecomment-634034654


   I should explain more clearly what I'm trying to do here :)
   
   `SortField` as currently constituted is a weird mix of concrete 
implementations and abstractions.  We have a set of implementations for the 
various numeric types, plus a keyword index based sort, score, and doc id, 
which are selected for via a SortType enum.  But this isn't extensible, so then 
we have an extra 'custom' type which anything that doesn't fit into these 
categories should use.  We also define everything as being based on a field 
(hence the name, plus `getField()`), but some sorts are based on multiple 
fields, or on something else entirely - score and doc return `null` from 
`getField()`, for example, or sorts based on expressions return the unparsed 
expression string.  So `SortField` needs an overhaul.  But, it's also used a 
lot in client code, and I wanted to keep a measure of backwards compatibility 
here.
   
   My idea was to add this `SortOrder` abstraction, and then convert everything 
that currently uses a SortField.CUSTOM implementation to instead just return a 
plain SortOrder. For the remaining SortField types, I think it would be worth 
looking instead at returning specialised SortOrders from factory methods on 
NumericDocValuesField, SortedDocValuesField, SortedSetDocValuesField, etc. 
SortField stays, but is deprecated, so clients currently building sorts by 
directly instantiating a SortField get pointed towards these new factory methods.
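   The enum-versus-interface trade-off described above can be shown with a 
self-contained sketch. This is illustrative only: the `SortOrder` name follows 
the PR, but `LongFieldOrder` and its factory-style construction are hypothetical 
stand-ins, not Lucene's actual API.

```java
// Illustrative sketch: an interface-based sort abstraction is open for
// extension, unlike a closed enum-driven SortField. Names other than
// SortOrder are hypothetical, not Lucene's real classes.
interface SortOrder {
    boolean getReverse();
    String describe();
}

// A concrete order of the kind a factory method on a field type might return
// (mirroring the proposed NumericDocValuesField-style factories).
final class LongFieldOrder implements SortOrder {
    private final String field;
    private final boolean reverse;

    LongFieldOrder(String field, boolean reverse) {
        this.field = field;
        this.reverse = reverse;
    }

    @Override public boolean getReverse() { return reverse; }
    @Override public String describe() { return "long:" + field + (reverse ? " desc" : " asc"); }
}

public class SortOrderSketch {
    public static void main(String[] args) {
        // Client code depends only on the interface; new sort kinds need no
        // changes to a central enum.
        SortOrder order = new LongFieldOrder("timestamp", true);
        System.out.println(order.describe()); // prints "long:timestamp desc"
    }
}
```

   The point of the sketch is the extension mechanism: anything implementing 
`SortOrder` participates in sorting, without a CUSTOM escape hatch.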



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on pull request #1504: SOLR-14462: cache more than one autoscaling session

2020-05-26 Thread GitBox


murblanc commented on pull request #1504:
URL: https://github.com/apache/lucene-solr/pull/1504#issuecomment-633664134


   @noblepaul will you be able to have a look at this PR?






[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1527: SOLR-14384 Stack SolrRequestInfo

2020-05-26 Thread GitBox


dsmiley commented on a change in pull request #1527:
URL: https://github.com/apache/lucene-solr/pull/1527#discussion_r430017058



##
File path: solr/core/src/java/org/apache/solr/request/SolrRequestInfo.java
##
@@ -38,7 +40,13 @@
 
 
 public class SolrRequestInfo {
-  protected final static ThreadLocal<SolrRequestInfo> threadLocal = new ThreadLocal<>();
+
+  protected final static int capacity = 150;

Review comment:
   I suggest debugging/investigating there to understand _why_.  Maybe this 
cap has exposed an actual problem?








[GitHub] [lucene-solr] dsmiley commented on pull request #1527: SOLR-14384 Stack SolrRequestInfo

2020-05-26 Thread GitBox


dsmiley commented on pull request #1527:
URL: https://github.com/apache/lucene-solr/pull/1527#issuecomment-633650898


   Perhaps the reset method should furthermore have an assertion to check that 
the stack is already empty?  If we decide that reset() is there as a safety 
measure that isn't to be used instead of clear(), then the assertion can alert 
us to problems (in tests) rather than hiding them.
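   A minimal sketch of the pattern under discussion: a thread-local stack of 
request contexts with a capacity cap, where `reset()` asserts emptiness rather 
than silently discarding leaked frames. All names and the `String` payload are 
illustrative; this is not the actual SolrRequestInfo implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of a stacked thread-local request context; not Solr's code.
public class RequestInfoStack {
    private static final int CAPACITY = 150;
    private static final ThreadLocal<Deque<String>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    public static void set(String info) {
        Deque<String> stack = STACK.get();
        if (stack.size() >= CAPACITY) {
            // A hit cap usually means callers forgot to clear(); fail loudly.
            throw new IllegalStateException("request-info stack overflow; missing clear()?");
        }
        stack.push(info);
    }

    public static String get() {
        return STACK.get().peek(); // current (innermost) request, or null
    }

    public static void clear() {
        STACK.get().poll(); // pop only the current frame
    }

    // Safety valve per the suggestion above: assert the stack is already empty
    // so tests surface leaks instead of hiding them.
    public static void reset() {
        assert STACK.get().isEmpty() : "reset() called with unpopped request info";
        STACK.get().clear();
    }

    public static void main(String[] args) {
        set("outer");
        set("inner");
        clear();
        System.out.println(get()); // prints "outer"
        clear();
        reset(); // passes: every set() was balanced by a clear()
    }
}
```

   With assertions enabled (`java -ea`), an unbalanced `set()` makes `reset()` 
fail in tests, which is the behaviour the comment above argues for.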






[GitHub] [lucene-solr] jpountz commented on a change in pull request #1351: LUCENE-9280: Collectors to skip noncompetitive documents

2020-05-26 Thread GitBox


jpountz commented on a change in pull request #1351:
URL: https://github.com/apache/lucene-solr/pull/1351#discussion_r429949584



##
File path: 
lucene/core/src/java/org/apache/lucene/search/FilteringNumericComparator.java
##
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import org.apache.lucene.index.LeafReaderContext;
+
+import java.io.IOException;
+
+/**
+ * A wrapper over {@code NumericComparator} that provides a leaf comparator that can filter non-competitive docs.
+ */
+public class FilteringNumericComparator extends FilteringFieldComparator {
+  public FilteringNumericComparator(NumericComparator in, boolean reverse, boolean singleSort) {
+    super(in, reverse, singleSort);
+  }
+
+  @Override
+  public final FilteringLeafFieldComparator getLeafComparator(LeafReaderContext context) throws IOException {
+    ((NumericComparator) in).doSetNextReader(context);

Review comment:
   this relies on the implementation detail that NumericComparator extends 
SimpleFieldComparator; can we instead call `LeafFieldComparator inLeafComparator 
= in.getLeafComparator(context);` and then apply the below if statements over 
`inLeafComparator` rather than `in`?

##
File path: 
lucene/core/src/java/org/apache/lucene/search/FilteringNumericComparator.java
##
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import org.apache.lucene.index.LeafReaderContext;
+
+import java.io.IOException;
+
+/**
+ * A wrapper over {@code NumericComparator} that provides a leaf comparator 
that can filter non-competitive docs.
+ */
+public class FilteringNumericComparator extends FilteringFieldComparator {
+  public FilteringNumericComparator(NumericComparator in, boolean reverse, boolean singleSort) {
+    super(in, reverse, singleSort);
+  }
+
+  @Override
+  public final FilteringLeafFieldComparator getLeafComparator(LeafReaderContext context) throws IOException {
+    ((NumericComparator) in).doSetNextReader(context);
+    if (in instanceof FieldComparator.LongComparator) {
+      return new FilteringNumericLeafComparator.FilteringLongLeafComparator((FieldComparator.LongComparator) in, context,
+          ((LongComparator) in).field, reverse, singleSort, hasTopValue);
+    } else if (in instanceof FieldComparator.IntComparator) {
+      return new FilteringNumericLeafComparator.FilteringIntLeafComparator((FieldComparator.IntComparator) in, context,
+          ((IntComparator) in).field, reverse, singleSort, hasTopValue);
+    } else if (in instanceof FieldComparator.DoubleComparator) {
+      return new FilteringNumericLeafComparator.FilteringDoubleLeafComparator((FieldComparator.DoubleComparator) in, context,
+          ((DoubleComparator) in).field, reverse, singleSort, hasTopValue);
+    } else { // instanceof FieldComparator.FloatComparator

Review comment:
   can you actually do the instanceof check so that it would be more 
future-proof? E.g. in case we add support for bfloat16 one day.
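   The review point above — check every type explicitly and fail loudly on 
anything unknown, rather than letting an implicit `else` absorb future types — 
can be sketched with plain JDK types (the class and method names here are 
illustrative, not the Lucene comparator hierarchy):

```java
// Illustrative: an exhaustive instanceof chain with a loud failure for
// unknown types, as suggested in the review above. Not Lucene code.
public class ExhaustiveDispatch {
    static String kindOf(Number n) {
        if (n instanceof Long) {
            return "long";
        } else if (n instanceof Integer) {
            return "int";
        } else if (n instanceof Double) {
            return "double";
        } else if (n instanceof Float) { // explicit check, not an implicit else
            return "float";
        } else {
            // A newly added numeric type (say, a future bfloat16 wrapper)
            // fails fast here instead of being silently mishandled.
            throw new IllegalStateException("unexpected type: " + n.getClass());
        }
    }

    public static void main(String[] args) {
        System.out.println(kindOf(1L));   // prints "long"
        System.out.println(kindOf(1.0f)); // prints "float"
    }
}
```

   The cost is one extra branch; the benefit is that adding a fifth type turns 
a silent wrong answer into an immediate exception.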

##
File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
##
@@ -302,18 +327,24 @@ private TopFieldCollector(FieldValueHitQueue pq, int numHits,
 this.numHits = numHits;
 this.hitsThresholdChecker = hitsThresholdChecker;
 this.numComparators = pq.getComparators().length;
-FieldComparator fieldComparator = pq.getComp

[jira] [Commented] (LUCENE-9123) JapaneseTokenizer with search mode doesn't work with SynonymGraphFilter

2020-05-26 Thread Trejkaz (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117364#comment-17117364
 ] 

Trejkaz commented on LUCENE-9123:
-

Noticing this in the changelog, and wondering whether this is a solution or 
workaround for LUCENE-5905 as well.


> JapaneseTokenizer with search mode doesn't work with SynonymGraphFilter
> ---
>
> Key: LUCENE-9123
> URL: https://issues.apache.org/jira/browse/LUCENE-9123
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Reporter: Kazuaki Hiraga
>Assignee: Tomoko Uchida
>Priority: Major
> Fix For: master (9.0), 8.5
>
> Attachments: LUCENE-9123.patch, LUCENE-9123_8x.patch
>
>
> JapaneseTokenizer with `mode=search` or `mode=extended` doesn't work with 
> either SynonymGraphFilter or SynonymFilter when JT generates multiple 
> tokens as an output. If we use `mode=normal`, it should be fine. However, we 
> would like to use decomposed tokens that can maximize the chance to increase 
> recall.
> Snippet of schema:
> {code:xml}
> <fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false">
>   <analyzer>
>     <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
>     <filter class="solr.SynonymGraphFilterFactory" synonyms="lang/synonyms_ja.txt"
>             tokenizerFactory="solr.JapaneseTokenizerFactory"/>
>     <filter class="solr.JapaneseBaseFormFilterFactory"/>
>     <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
>     <filter class="solr.CJKWidthFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt"/>
>     <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
> An synonym entry that generates error:
> {noformat}
> 株式会社,コーポレーション
> {noformat}
> The following is an output on console:
> {noformat}
> $ ./bin/solr create_core -c jp_test -d ../config/solrconfs
> ERROR: Error CREATEing SolrCore 'jp_test': Unable to create core [jp_test3] 
> Caused by: term: 株式会社 analyzed to a token (株式会社) with position increment != 1 
> (got: 0)
> {noformat}






[GitHub] [lucene-solr] romseygeek opened a new pull request #1537: LUCENE-9381: Add SortOrder interface

2020-05-26 Thread GitBox


romseygeek opened a new pull request #1537:
URL: https://github.com/apache/lucene-solr/pull/1537


   This commit extracts a new SortOrder interface from SortField, containing 
only those methods that are directly necessary to implement a sort.


