[jira] [Commented] (LUCENE-9077) Gradle build

2020-05-27 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117445#comment-17117445
 ] 

Dawid Weiss commented on LUCENE-9077:
-

Avoiding the manifest is not the right approach. It is a *great* thing that 
gradle detects that something has changed and follows the task chain of 
corresponding updates - if the timestamp in the manifest is required 
then it should be propagated.

I understand this may be a problem with frequent builds and there are 
workarounds that can be applied but cache avoidance or exclusion is not the 
right way. The current manifest contains these fields that may vary from build 
to build:
{code:java}
"${project.version} ${gitRev} - ${System.properties['user.name']} - 
${buildDate} ${buildTime}" {code}
I'd say my preferred way of solving this would be to:

1) remove buildTime entirely or truncate it to, say, an hour. This makes full 
rebuilds of JARs trigger automatically every longer period of time and still 
leaves all the information. Git revision and build date should be sufficient. 
The time to re-assemble JARs is fairly small anyway.

2) make the implementationVersion line different for production and snapshot 
builds. Snapshot builds could only include project version so that jars remain 
fairly constant between rebuilds.

3) only add implementationVersion field to jars if we run together with a 
different task (something like top-level "release"). I don't like this because 
it adds cross-task dependencies and complexity that will make people scratch 
their heads.
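A minimal Gradle sketch of option 2, as a rough illustration only: {{gitRev}} 
and {{buildDate}} are assumed to be computed elsewhere in the build (the names 
are taken from the snippet above), and detecting snapshot builds via the 
version suffix is an assumption, not the actual Lucene build logic.
{code:groovy}
// Sketch (option 2): keep snapshot manifests stable between rebuilds.
def isSnapshot = project.version.toString().endsWith('-SNAPSHOT')
tasks.withType(Jar).configureEach {
    manifest {
        attributes('Implementation-Version': isSnapshot
            // snapshot: project version only, so the JAR stays constant
            ? "${project.version}"
            // release: full provenance line, as today
            : "${project.version} ${gitRev} - ${System.properties['user.name']} - ${buildDate}")
    }
}
{code}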

 


[jira] [Commented] (LUCENE-9077) Gradle build

2020-05-27 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117452#comment-17117452
 ] 

Dawid Weiss commented on LUCENE-9077:
-

This is related to LUCENE-9301 - let's move the discussion there.

> Gradle build
> 
>
> Key: LUCENE-9077
> URL: https://issues.apache.org/jira/browse/LUCENE-9077
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9077-javadoc-locale-en-US.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> This task focuses on providing a gradle-based build equivalent for Lucene and 
> Solr (on the master branch). See notes below on why this respin is needed.
> The code lives on the *gradle-master* branch. It is kept in sync with *master*. 
> Try running the following to see an overview of helper guides concerning 
> typical workflow, testing and ant-migration helpers:
> gradlew :help
> A list of items that need to be added or require work. If you'd like to 
> work on any of these, please add your name to the list. Once you have a 
> patch/ pull request let me (dweiss) know - I'll try to coordinate the merges.
>  * (/) Apply forbiddenAPIs
>  * (/) Generate hardware-aware gradle defaults for parallelism (count of 
> workers and test JVMs).
>  * (/) Fail the build if --tests filter is applied and no tests execute 
> during the entire build (this allows for an empty set of filtered tests at 
> single project level).
>  * (/) Port other settings and randomizations from common-build.xml
>  * (/) Configure security policy/ sandboxing for tests.
>  * (/) test's console output on -Ptests.verbose=true
>  * (/) add a :helpDeps explanation to how the dependency system works 
> (palantir plugin, lockfile) and how to retrieve structured information about 
> current dependencies of a given module (in a tree-like output).
>  * (/) jar checksums, jar checksum computation and validation. This should be 
> done without intermediate folders (directly on dependency sets).
>  * (/) verify min. JVM version and exact gradle version on build startup to 
> minimize odd build side-effects
>  * (/) Repro-line for failed tests/ runs.
>  * (/) add a top-level README note about building with gradle (and the 
> required JVM).
>  * (/) add an equivalent of 'validate-source-patterns' 
> (check-source-patterns.groovy) to precommit.
>  * (/) add an equivalent of 'rat-sources' to precommit.
>  * (/) add an equivalent of 'check-example-lucene-match-version' (solr only) 
> to precommit.
>  * (/) javadoc compilation
> Hard-to-implement stuff already investigated:
>  * (/) (done)  -*Printing console output of failed tests.* There doesn't seem 
> to be any way to do this in a reasonably efficient way. There are onOutput 
> listeners but they're slow to operate and solr tests emit *tons* of output so 
> it's overkill.-
>  * (!) (LUCENE-9120) *Tests working with security-debug logs or other 
> JVM-early log output*. Gradle's test runner works by redirecting Java's 
> stdout/ stderr so this just won't work. Perhaps we can spin up the ant-based 
> test runner for such corner-cases.
> Of lesser importance:
>  * Add an equivalent of 'documentation-lint" to precommit.
>  * (/) Do not require files to be committed before running precommit. (staged 
> files are fine).
>  * (/) add rendering of javadocs (gradlew javadoc)
>  * Attach javadocs to maven publications.
>  * Add test 'beasting' (rerunning the same suite multiple times). I'm afraid 
> it'll be difficult to run it sensibly because gradle doesn't offer cwd 
> separation for the forked test runners.
>  * if you diff the solr packaged distribution against the ant-created 
> distribution there are minor differences in library versions and some JARs 
> are excluded/ moved around. I didn't try to force these, as everything seems 
> to work (tests, etc.) - perhaps these differences should be fixed in the ant 
> build instead.
>  * (/) identify and port various "regenerate" tasks from ant builds (javacc, 
> precompiled automata, etc.)
>  * Fill in POM details in gradle/defaults-maven.gradle so that they reflect 
> the previous content better (dependencies aside).
>  * Add any IDE integration layers that should be added (I use IntelliJ and it 
> imports the project out of the box, without the need for any special tuning).
>  ** Remove idea task and just import from the gradle model? One less thing to 
> maintain.
>  * Add Solr packaging for docs/* (see TODO in packaging/build.gradle; 
> currently XSLT...)
>  * I didn't bother adding Solr dist/test-framework to packaging (who'd use it 
> from a binary distribution?) 
>  * There is some python execution in check-broken-links and 
> check-missing-javadocs, not sure if it's been ported
>  * Nightly-smoke also 

[jira] [Reopened] (LUCENE-9301) Gradle: Jar MANIFEST incomplete

2020-05-27 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss reopened LUCENE-9301:
-

D. Smiley expressed concerns that, with the manifest including timestamps, JARs 
are reassembled on every rebuild (they should be, because their inputs 
change).

I understand this may be a problem with frequent builds and there are 
workarounds that can be applied but cache avoidance or exclusion is not the 
right way. The current manifest contains these fields that may vary from build 
to build:
{code:java}
"${project.version} ${gitRev} - ${System.properties['user.name']} - 
${buildDate} ${buildTime}" {code}
I'd say my preferred way of solving this would be to:

1) remove buildTime entirely or truncate it to, say, an hour. This makes full 
rebuilds of JARs trigger automatically every longer period of time and still 
leaves all the information. Git revision and build date should be sufficient. 
The time to re-assemble JARs is fairly small anyway.

2) make the implementationVersion line different for production and snapshot 
builds. Snapshot builds could only include project version so that jars remain 
fairly constant between rebuilds.

3) only add implementationVersion field to jars if we run together with a 
different task (something like top-level "release"). I don't like this because 
it adds cross-task dependencies and complexity that will make people scratch 
their heads.
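Option 1 could look roughly like this in the build script; this is a sketch 
only, and the format strings are illustrative assumptions rather than the 
actual Lucene build code.
{code:groovy}
// Sketch (option 1): truncate the build timestamp to the hour, so the
// manifest (and therefore the JAR) changes at most once per hour locally.
def now = new Date()
def buildDate = now.format('yyyy-MM-dd')
def buildTime = now.format('HH:00')   // minutes and seconds dropped
{code}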

> Gradle: Jar MANIFEST incomplete
> ---
>
> Key: LUCENE-9301
> URL: https://issues.apache.org/jira/browse/LUCENE-9301
> Project: Lucene - Core
>  Issue Type: Sub-task
>  Components: general/build
>Affects Versions: master (9.0)
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> After building with gradle, the MANIFEST.MF file for e.g. solr-core.jar 
> contains
> {noformat}
> Manifest-Version: 1.0
> {noformat}
> Whereas when building with ant, it says
> {noformat}
> Manifest-Version: 1.0
> Ant-Version: Apache Ant 1.10.7
> Created-By: 11.0.6+10 (AdoptOpenJDK)
> Extension-Name: org.apache.solr
> Specification-Title: Apache Solr Search Server: solr-core
> Specification-Version: 9.0.0
> Specification-Vendor: The Apache Software Foundation
> Implementation-Title: org.apache.solr
> Implementation-Version: 9.0.0-SNAPSHOT 9b5542ad55da601e0bdfda96bad8c2c
>  cabbbc397 - janhoy - 2020-04-01 16:24:09
> Implementation-Vendor: The Apache Software Foundation
> X-Compile-Source-JDK: 11
> X-Compile-Target-JDK: 11
> {noformat}
> In addition, with ant, the META-INF folder also contains LICENSE.txt and 
> NOTICE.txt files.
> There is a macro {{build-manifest}} in common-build.xml that seems to build 
> the manifest.
> The effect of this is e.g. that spec and implementation versions do not show 
> in the Solr Admin UI



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9301) Gradle: Jar MANIFEST incomplete

2020-05-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117489#comment-17117489
 ] 

Jan Høydahl commented on LUCENE-9301:
-

2) looks like a good compromise.







[GitHub] [lucene-solr] iverase opened a new pull request #1538: LUCENE-9368: Use readLELongs to read docIds on BKD leaf nodes

2020-05-27 Thread GitBox


iverase opened a new pull request #1538:
URL: https://github.com/apache/lucene-solr/pull/1538


   This change alters the way we read docIds from the index by using readLELongs, 
so we can read most of the docs in one batch into a temporary array. This 
allows the compiler to run more efficient loops, since we are copying 
information between two arrays.
   
   In addition, we add two new encoding paths for int8 and int16.
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] iverase commented on pull request #1503: LUCENE-9368: SIMD-based decoding of BKD docIds

2020-05-27 Thread GitBox


iverase commented on pull request #1503:
URL: https://github.com/apache/lucene-solr/pull/1503#issuecomment-634558069


   I am going to close this in favour of 
https://github.com/apache/lucene-solr/pull/1538 as it is less intrusive and 
performance tests show even better behaviour.






[GitHub] [lucene-solr] iverase closed pull request #1503: LUCENE-9368: SIMD-based decoding of BKD docIds

2020-05-27 Thread GitBox


iverase closed pull request #1503:
URL: https://github.com/apache/lucene-solr/pull/1503


   






[jira] [Commented] (LUCENE-9301) Gradle: Jar MANIFEST incomplete

2020-05-27 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117599#comment-17117599
 ] 

Uwe Schindler commented on LUCENE-9301:
---

I would nuke the datestamp from the manifest. The version and gitrev don't 
change while developing, only when you commit (and then it's fine to change them).

But the date/time should not be part of the Manifest, unless it's part of a 
separate Manifest entry that can be excluded from change checks by Gradle. 
IMHO, the whole discussion goes in that direction: the date/time is already in 
the file metadata of each jar entry, so why repeat it? Gradle knows how to 
handle that, as it ignores the file dates.

Another thing to look into: with timestamps, builds are not reproducible. They 
aren't even without the manifest timestamp (the JAR is always different because 
of ZIP entry dates), but other projects already set a fixed date for the ZIP 
file entries to make builds reproducible. The JAR task should also sort the 
files in a consistent order - OpenJDK is working on that, but custom tools like 
Gradle may build JAR files with their own code.
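Gradle already exposes archive-normalization switches along these lines. A 
sketch (these are real AbstractArchiveTask properties, though whether they fit 
the Lucene build as-is is untested here):
{code:groovy}
// Sketch: make JAR/ZIP outputs byte-for-byte reproducible.
tasks.withType(AbstractArchiveTask).configureEach {
    preserveFileTimestamps = false  // use a fixed timestamp for ZIP entries
    reproducibleFileOrder = true    // add entries in a stable, sorted order
}
{code}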







[jira] [Comment Edited] (LUCENE-9301) Gradle: Jar MANIFEST incomplete

2020-05-27 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117599#comment-17117599
 ] 

Uwe Schindler edited comment on LUCENE-9301 at 5/27/20, 10:02 AM:
--

I would nuke the datestamp from the manifest. The version and gitrev don't 
change while developing, only when you commit (and then it's fine to change them).

But the date/time should not be part of the Manifest, unless it's part of a 
separate Manifest entry that can be excluded from change checks by Gradle. 
IMHO, the whole discussion goes in that direction: the date/time is already in 
the file metadata of each jar entry, so why repeat it? Gradle knows how to 
handle that, as it ignores the file dates.

Another thing to look into: with timestamps, builds are not reproducible. They 
aren't even without our Manifest changes (the JAR is always different because 
of ZIP entry dates), but other projects have already started to set a fixed 
date for the ZIP file entries to make builds reproducible. The JAR task should 
also sort the files in a consistent order - OpenJDK is working on that, but 
custom tools like Gradle may build JAR files with their own code.









[jira] [Commented] (LUCENE-9301) Gradle: Jar MANIFEST incomplete

2020-05-27 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117644#comment-17117644
 ] 

Dawid Weiss commented on LUCENE-9301:
-

{quote}The date/time is already in file metadata of each jar entry, why 
repeat it? Gradle knows how to handle that as it ignores the file dates.
{quote}
I agree - I don't see any particular reason to include it.







[jira] [Commented] (SOLR-14280) SolrConfig logging not helpful

2020-05-27 Thread Andras Salamon (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117669#comment-17117669
 ] 

Andras Salamon commented on SOLR-14280:
---

[~gerlowskija] Yes, [~erickerickson] also suggested this to me. I had uploaded 
the first patch before that, but I won't use suffixes in my new patches.

[~erickerickson] I'm happy to volunteer for this - should I open a LUCENE or a 
SOLR jira?

Never thought this one-liner (OK, three-liner in the end) patch would be that 
popular. :)

> SolrConfig logging not helpful
> --
>
> Key: SOLR-14280
> URL: https://issues.apache.org/jira/browse/SOLR-14280
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Jason Gerlowski
>Priority: Minor
> Fix For: master (9.0), 8.6
>
> Attachments: SOLR-14280-01.patch, SOLR-14280-02.patch, getmessages.txt
>
>
> SolrConfig prints out a warning message if it's not able to add files to the 
> classpath, but this message is not too helpful:
> {noformat}
> o.a.s.c.SolrConfig Couldn't add files from 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/dist filtered 
> by solr-langid-\d.*\.jar to classpath: 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/
> dist {noformat}
> The reason should be at the end of the log message, but it just repeats the 
> problematic file name.






[jira] [Commented] (SOLR-14280) SolrConfig logging not helpful

2020-05-27 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117796#comment-17117796
 ] 

Erick Erickson commented on SOLR-14280:
---

[~asalamon74] Please open up a new JIRA. @-mention me and/or Jason so we know 
how to track it. What IDE do you use? I can give you a hack of the 
validateLoggingCalls gradle build file that makes finding all of these easy, at 
least in IntelliJ. Let me know.

What's unclear to me is whether any of them should remain just getMessage() or 
should print out the entire stack trace.
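The distinction being discussed can be sketched with the SLF4J logging pattern 
Solr uses; the message text and variable names here are illustrative, not the 
actual SolrConfig code.
{code:java}
// Passing the exception as the last argument logs the full stack trace:
log.warn("Couldn't add files from {} filtered by {} to classpath", dist, regex, e);
// Passing only getMessage() loses the trace - often it just repeats the path:
log.warn("Couldn't add files to classpath: {}", e.getMessage());
{code}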







[jira] [Updated] (SOLR-14467) inconsistent server errors combining relatedness() with allBuckets:true

2020-05-27 Thread Michael Gibney (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated SOLR-14467:
--
Attachment: SOLR-14467.patch

> inconsistent server errors combining relatedness() with allBuckets:true
> ---
>
> Key: SOLR-14467
> URL: https://issues.apache.org/jira/browse/SOLR-14467
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14467.patch, SOLR-14467.patch, 
> SOLR-14467_test.patch, SOLR-14467_test.patch
>
>
> While working on randomized testing for SOLR-13132 I discovered a variety of 
> different ways that JSON Faceting's "allBuckets" option can fail when 
> combined with the "relatedness()" function.
> I haven't found a trivial way to manually reproduce this, but I have been able 
> to trigger the failures with a trivial patch to {{TestCloudJSONFacetSKG}}, 
> which I will attach.
> Based on the nature of the failures it looks like it may have something to do 
> with multiple segments of different sizes, and/or resizing the SlotAccs?
> The relatedness() function doesn't have many (any?) existing tests in place 
> that leverage "allBuckets", so this is probably a bug that has always existed 
> -- it's possible it may be excessively cumbersome to fix and we might 
> need/want to just document that incompatibility and add some code to try and 
> detect if the user combines these options and, if so, fail with a 400 error?






[jira] [Commented] (SOLR-14467) inconsistent server errors combining relatedness() with allBuckets:true

2020-05-27 Thread Michael Gibney (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117822#comment-17117822
 ] 

Michael Gibney commented on SOLR-14467:
---

I just uploaded a patch that I think covers everything we discussed above. 
There's a nocommit marking an edge-case problem for distributed merging. I 
think this happens when buckets are merged without any shard having 
initialized a bucket. If all the shards are implied (counts of zero), then 
there's no way to know whether the bucket is allBuckets or a term bucket, and 
it's not possible to choose the proper {{externalize(boolean)}} implementation.

... although, as I write this, I _think_ maybe we _can_ choose ... but I'm not 
sure how to test/verify this. I'm thinking that any term bucket that is merged 
is guaranteed to have at least one "materialized" (i.e., not "implied/empty") 
{{BucketData}}, which would identify that mergeResult as being for a term 
bucket, not allBuckets. But the {{allBuckets}} bucket is present (and gets 
merged) regardless of whether there's any "materialized" content there, so if 
all the merged buckets are all "implied/empty", perhaps we can infer that we're 
dealing with the "allBuckets" bucket?
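As a toy illustration of that inference (not Solr's actual classes; this {{BucketData}} is a hypothetical stand-in carrying only an "implied" flag), the rule "a term bucket always has at least one materialized shard bucket, so an all-implied merge can only be allBuckets" could be sketched as:

```java
import java.util.List;

public class BucketTypeInference {
    // "implied" = the shard never materialized this bucket (count of zero).
    record BucketData(boolean implied) {}

    // A term bucket is guaranteed at least one materialized shard bucket,
    // so a merge where every input is implied can only be the allBuckets bucket.
    static boolean isAllBuckets(List<BucketData> mergedShardBuckets) {
        return mergedShardBuckets.stream().allMatch(BucketData::implied);
    }

    public static void main(String[] args) {
        if (!isAllBuckets(List.of(new BucketData(true), new BucketData(true)))) {
            throw new AssertionError("all-implied merge should be allBuckets");
        }
        if (isAllBuckets(List.of(new BucketData(true), new BucketData(false)))) {
            throw new AssertionError("a materialized bucket marks a term bucket");
        }
        System.out.println("ok");
    }
}
```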







[jira] [Commented] (SOLR-14467) inconsistent server errors combining relatedness() with allBuckets:true

2020-05-27 Thread Michael Gibney (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117826#comment-17117826
 ] 

Michael Gibney commented on SOLR-14467:
---

More specifically, I'm thinking we might be able to do something like this?:
{code:java}
public Object getMergedResult() {
  final BucketData bucketData;
  switch (bucketType) {
    default:
      // NOTE: TERM buckets would always have at least one shard with a "materialized"
      // bucket, so if all the buckets are implied/empty, we can infer we're dealing
      // with the allBuckets bucket.
    case ALL_BUCKET:
      bucketData = ALL_BUCKETS_SPECIAL_BUCKET_DATA;
      break;
    case TERM:
      bucketData = mergedData;
      break;
  }
  return bucketData.externalize(false);
}
{code}







[jira] [Comment Edited] (SOLR-14280) SolrConfig logging not helpful

2020-05-27 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117796#comment-17117796
 ] 

Erick Erickson edited comment on SOLR-14280 at 5/27/20, 3:14 PM:
-

[~asalamon74] Please open up a new JIRA. @-mention me and/or Jason so we can 
track it. What IDE do you use? 'Cause I can give you a hack of the 
validateLoggingCalls gradle build file that makes finding all of these easy, at 
least in IntelliJ. Actually, it wouldn't be a hack; if you clean all of these 
up it'll be permanent. I just glanced at the text file I pasted on this JIRA 
and don't quite know why it reported the message that's commented out.

What's unclear to me is whether any of them should remain just getMessage() or 
should print out the entire stack trace. If some of them should _stay_ 
getMessage/Cause(), add //logok to the line and the validation will not report 
the issue.

I suppose we need two JIRAs, one for Lucene and one for Solr. Do note that the 
Lucene ones are in Luke (except the one that's commented out). Other than 
Luke, Lucene doesn't use logging.
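To make the getMessage()-vs-stack-trace distinction concrete, here's a self-contained sketch using only the JDK (no SLF4J, and not Solr's actual code): logging only getMessage() drops the cause chain that the full stack trace preserves.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class LogCauseDemo {
    // Render the full stack trace, including the "Caused by:" chain.
    static String fullTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));
        return sw.toString();
    }

    public static void main(String[] args) {
        Exception e = new RuntimeException("couldn't add files to classpath",
                new java.io.IOException("root cause: permission denied"));
        String messageOnly = e.getMessage();   // drops the cause entirely
        String trace = fullTrace(e);           // keeps the "Caused by:" chain

        if (messageOnly.contains("root cause")) throw new AssertionError();
        if (!trace.contains("root cause: permission denied")) throw new AssertionError();
        System.out.println("ok");
    }
}
```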


was (Author: erickerickson):
[~asalamon74] Please open up a new JIRA. @-mention me and/or Jason so we know 
now to track. What IDE do you use? 'cause I can give you a hack of the 
validateLoggingCalls gradle build file that makes finding all these easy, at 
least in IntelliJ. Let me know.

What's unclear to me is if any of them should remain just getMessage() or 
should print out the entire stack trace.

> SolrConfig logging not helpful
> --
>
> Key: SOLR-14280
> URL: https://issues.apache.org/jira/browse/SOLR-14280
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Jason Gerlowski
>Priority: Minor
> Fix For: master (9.0), 8.6
>
> Attachments: SOLR-14280-01.patch, SOLR-14280-02.patch, getmessages.txt
>
>
> SolrConfig prints out a warning message if it's not able to add files to the 
> classpath, but this message is not too helpful:
> {noformat}
> o.a.s.c.SolrConfig Couldn't add files from 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/dist filtered 
> by solr-langid-\d.*\.jar to classpath: 
> /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.1850855/lib/solr/
> dist {noformat}
> The reason should be at the end of the log message, but it just repeats the 
> problematic file name.






[GitHub] [lucene-solr] mikemccand commented on pull request #1538: LUCENE-9368: Use readLELongs to read docIds on BKD leaf nodes

2020-05-27 Thread GitBox


mikemccand commented on pull request #1538:
URL: https://github.com/apache/lucene-solr/pull/1538#issuecomment-634749641


   This looks promising @iverase!  Do you have benchmark results for this 
approach?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14474) Fix remaining auxilliary class warnings in Solr

2020-05-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117898#comment-17117898
 ] 

ASF subversion and git services commented on SOLR-14474:


Commit 473185ad06801ad444d7377a531396b51c106591 in lucene-solr's branch 
refs/heads/branch_8x from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=473185a ]

SOLR-14474: Fix remaining auxilliary class warnings in Solr


> Fix remaining auxilliary class warnings in Solr
> ---
>
> Key: SOLR-14474
> URL: https://issues.apache.org/jira/browse/SOLR-14474
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We have quite a number of situations where multiple classes are declared in a 
> single source file, which is a poor practice. I ran across a bunch of these 
> in solr/core, and [~mdrob] fixed some of these in SOLR-14426. [~dsmiley] 
> looked at those and thought that it would have been better to just move a 
> particular class to its own file. And [~uschindler] do you have any comments?
> I have a fork with a _bunch_ of changes to get warnings out that include 
> moving more than a few classes into static inner classes, including the one 
> Mike did. I do NOT intend to commit this, it's too big/sprawling, but it does 
> serve to show a variety of situations. See: 
> https://github.com/ErickErickson/lucene-solr/tree/jira/SOLR-10810 for how 
> ugly it all looks. I intend to break this wodge down into smaller tasks and 
> start over now that I have a clue as to the scope. And do ignore the generics 
> changes as well as the consequences of upgrading apache commons CLI, those 
> need to be their own JIRA.
> What I'd like to do is agree on some guidelines for when to move classes to 
> their own file and when to move them to static inner classes.
> Some things I saw, reference the fork for the changes (again, I won't check 
> that in).
> 1> DocValuesAcc has no fewer than 9 classes that could be moved inside the 
> main class. But they all become "static abstract". And take 
> "DoubleSortedNumericDVAcc" in that class: it gets extended in 4 other 
> files. How would all that get resolved? How many of them would people 
> recommend moving into their own files? Do we want to proliferate all those? 
> And so on with the plethora of other classes in 
> org.apache.solr.search.facet.
> This is particularly thorny because the choices would be about a zillion new 
> classes or about a zillion edits.
> Does the idea of abstract vs. concrete classes make any difference? IOW, if 
> we change an abstract class to a nested class, then maybe we just have to 
> change the class(es) that extend it?
> 2> StatsComponent.StatsInfo probably should be its own file?
> 3> FloatCmp, LongCmp, DoubleCmp all declare classes with "Comp" rather than 
> "Cmp". Those files should just be renamed.
> 4> JSONResponseWriter. ???
> 5> FacetRangeProcessor seems like it needs its own class
> 6> FacetRequestSorted seems like it needs its own class
> 7> FacetModule
> So what I'd like going forward is to agree on some guidelines to resolve 
> whether to move a class to its own file or make it nested (probably static). 
> Not hard-and-fast rules, just something to cut down on the rework due to 
> objections.
> And what about backporting to 8x? My suggestion is to backport what's 
> easy/doesn't break back-compat, in order to make keeping the two branches in 
> sync easier.






[jira] [Commented] (SOLR-14474) Fix remaining auxilliary class warnings in Solr

2020-05-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117899#comment-17117899
 ] 

ASF subversion and git services commented on SOLR-14474:


Commit 07a9b5d1b0eb06adbb6993de2ee615a09609ac90 in lucene-solr's branch 
refs/heads/master from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=07a9b5d ]

SOLR-14474: Fix remaining auxilliary class warnings in Solr








[jira] [Resolved] (SOLR-14474) Fix remaining auxilliary class warnings in Solr

2020-05-27 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-14474.
---
Fix Version/s: 8.6
   Resolution: Fixed

> Fix remaining auxilliary class warnings in Solr
> ---
>
> Key: SOLR-14474
> URL: https://issues.apache.org/jira/browse/SOLR-14474
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Fix For: 8.6
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h






[jira] [Commented] (LUCENE-9380) Fix auxiliary class warnings in Lucene

2020-05-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117909#comment-17117909
 ] 

ASF subversion and git services commented on LUCENE-9380:
-

Commit db347b36859e19fdea862b8decda7f7d457841e7 in lucene-solr's branch 
refs/heads/branch_8x from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=db347b3 ]

LUCENE-9380: Fix auxiliary class warnings in Lucene


> Fix auxiliary class warnings in Lucene
> --
>
> Key: LUCENE-9380
> URL: https://issues.apache.org/jira/browse/LUCENE-9380
> Project: Lucene - Core
>  Issue Type: Improvement
> Environment: There are only three and they're entirely simple so I'll 
> fix them up. Since they're in Lucene, I thought it should be a separate JIRA.
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>







[jira] [Commented] (LUCENE-9380) Fix auxiliary class warnings in Lucene

2020-05-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117910#comment-17117910
 ] 

ASF subversion and git services commented on LUCENE-9380:
-

Commit b576ef6c8cec516aac1777f8ee9c04409eba7c4e in lucene-solr's branch 
refs/heads/master from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b576ef6 ]

LUCENE-9380: Fix auxiliary class warnings in Lucene


>







[GitHub] [lucene-solr] ErickErickson closed pull request #1535: SOLR-14474: Fix remaining auxilliary class warnings in Solr

2020-05-27 Thread GitBox


ErickErickson closed pull request #1535:
URL: https://github.com/apache/lucene-solr/pull/1535


   






[GitHub] [lucene-solr] ErickErickson closed pull request #1534: LUCENE-9380: Fix auxiliary class warnings in Lucene

2020-05-27 Thread GitBox


ErickErickson closed pull request #1534:
URL: https://github.com/apache/lucene-solr/pull/1534


   






[jira] [Updated] (SOLR-11334) UnifiedSolrHighlighter returns an error when hl.fl delimited by ", "

2020-05-27 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-11334:

Issue Type: Improvement  (was: Bug)
  Priority: Minor  (was: Trivial)

> UnifiedSolrHighlighter returns an error when hl.fl delimited by ", "
> 
>
> Key: SOLR-11334
> URL: https://issues.apache.org/jira/browse/SOLR-11334
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Affects Versions: 6.6
> Environment: Ubuntu 17.04 (GNU/Linux 4.10.0-33-generic x86_64)
> Java HotSpot 64-Bit Server VM(build 25.114-b01, mixed mode)
>Reporter: Yasufumi Mizoguchi
>Priority: Minor
> Attachments: SOLR-11334.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> UnifiedSolrHighlighter (hl.method=unified) misjudges the zero-length string as 
> a field name and returns an error when hl.fl is delimited by ", ".
> request:
> {code}
> $ curl -XGET 
> "http://localhost:8983/solr/techproducts/select?fl=name,%20manu&hl.fl=name,%20manu&hl.method=unified&hl=on&indent=on&q=corsair&wt=json";
> {code}
> response:
> {code}
> {
>   "responseHeader":{
> "status":400,
> "QTime":8,
> "params":{
>   "q":"corsair",
>   "hl":"on",
>   "indent":"on",
>   "fl":"name, manu",
>   "hl.fl":"name, manu",
>   "hl.method":"unified",
>   "wt":"json"}},
>   "response":{"numFound":2,"start":0,"docs":[
>   {
> "name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 
> (PC 3200) System Memory - Retail",
> "manu":"Corsair Microsystems Inc."},
>   {
> "name":"CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 
> 400 (PC 3200) Dual Channel Kit System Memory - Retail",
> "manu":"Corsair Microsystems Inc."}]
>   },
>   "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","org.apache.solr.common.SolrException"],
> "msg":"undefined field ",
> "code":400}}
> {code}
> DefaultHighlighter's response:
> {code}
> {
>   "responseHeader":{
> "status":0,
> "QTime":5,
> "params":{
>   "q":"corsair",
>   "hl":"on",
>   "indent":"on",
>   "fl":"name, manu",
>   "hl.fl":"name, manu",
>   "hl.method":"original",
>   "wt":"json"}},
>   "response":{"numFound":2,"start":0,"docs":[
>   {
> "name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 
> (PC 3200) System Memory - Retail",
> "manu":"Corsair Microsystems Inc."},
>   {
> "name":"CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 
> 400 (PC 3200) Dual Channel Kit System Memory - Retail",
> "manu":"Corsair Microsystems Inc."}]
>   },
>   "highlighting":{
> "VS1GB400C3":{
>   "name":["CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered 
> DDR 400 (PC 3200) System Memory - Retail"],
>   "manu":["Corsair Microsystems Inc."]},
> "TWINX2048-3200PRO":{
>   "name":["CORSAIR  XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM 
> Unbuffered DDR 400 (PC 3200) Dual Channel Kit System"],
>   "manu":["Corsair Microsystems Inc."]}}}
> {code}






[GitHub] [lucene-solr] madrob opened a new pull request #1539: Fix typos in release wizard

2020-05-27 Thread GitBox


madrob opened a new pull request #1539:
URL: https://github.com/apache/lucene-solr/pull/1539


   






[jira] [Created] (SOLR-14517) MM local params value is ignored in edismax queries with operators

2020-05-27 Thread Yuriy Koval (Jira)
Yuriy Koval created SOLR-14517:
--

 Summary: MM local params value is ignored in edismax queries with 
operators
 Key: SOLR-14517
 URL: https://issues.apache.org/jira/browse/SOLR-14517
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: query parsers
Affects Versions: 8.4.1
Reporter: Yuriy Koval


When specifying "mm" as a local parameter:

{color:#e01e5a}q=\{!edismax mm="100%25" v=$qq}&qq=foo %2Bbar&rows=0&uf=* 
_query_{color}
 {color:#1d1c1d}is not functionally equivalent to{color}
 {{{color:#e01e5a}q=\{!edismax v=$qq}&qq=foo %2Bbar&rows=0&uf=* 
_query_&mm=100%25{color}}}

 It seems to be caused by the following code in 
{color:#e01e5a}ExtendedDismaxQParser{color}

 
{code:java}
// For correct lucene queries, turn off mm processing if no explicit mm spec was provided
// and there were explicit operators (except for AND).
if (query instanceof BooleanQuery) {
  // config.minShouldMatch holds the value of mm which MIGHT have come from the user,
  // but could also have been derived from q.op.
  String mmSpec = config.minShouldMatch;

  if (foundOperators(clauses, config.lowercaseOperators)) {
    // Use provided mm spec if present, otherwise turn off mm processing
    mmSpec = params.get(DisMaxParams.MM, "0%");
  }
{code}
 

We need to check whether the user specified "mm" explicitly. We could change
{code:java}
mmSpec = params.get(DisMaxParams.MM, "0%");
{code}
to
{code:java}
mmSpec = config.solrParams.get(DisMaxParams.MM, "0%");
{code}
so that we check local params too.
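A toy model of why that one-line change matters (these plain maps are stand-ins, not Solr's real SolrParams API): the old lookup only sees the request's top-level parameters, while the fixed lookup consults local params first.

```java
import java.util.Map;

public class MmLookupDemo {
    // Old lookup: only the request's global params are consulted.
    static String oldMm(Map<String, String> global) {
        return global.getOrDefault("mm", "0%");
    }

    // Fixed lookup: local params shadow global params, the way
    // config.solrParams layers them in the patch above.
    static String newMm(Map<String, String> local, Map<String, String> global) {
        String v = local.get("mm");
        return v != null ? v : global.getOrDefault("mm", "0%");
    }

    public static void main(String[] args) {
        Map<String, String> local = Map.of("mm", "100%"); // {!edismax mm="100%" ...}
        Map<String, String> global = Map.of();            // no top-level mm param

        if (!oldMm(global).equals("0%")) throw new AssertionError();     // mm silently dropped
        if (!newMm(local, global).equals("100%")) throw new AssertionError(); // mm honored
        System.out.println("ok");
    }
}
```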

 






[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-27 Thread Mike Drob (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118019#comment-17118019
 ] 

Mike Drob commented on SOLR-14498:
--

This causes precommit failures on master; the version needs to be updated in a 
few more places.

{noformat}
> Task :solr:core:validateJarChecksums FAILED

FAILURE: Build failed with an exception.

* Where:
Script '/Users/mdrob/code/lucene-solr/gradle/validation/jar-checks.gradle' 
line: 195

* What went wrong:
Execution failed for task ':solr:core:validateJarChecksums'.
> Dependency checksum validation failed:
- Dependency checksum missing 
('com.github.ben-manes.caffeine:caffeine:2.8.0'), expected it at: 
/Users/mdrob/code/lucene-solr/solr/licenses/caffeine-2.8.0.jar.sha1
{noformat}

{noformat}
mdrob-imp:~/code/lucene-solr $ git grep 2[.]8[.]0
versions.lock:com.github.ben-manes.caffeine:caffeine:2.8.0 (1 constraints: 
0c050d36)
versions.props:com.github.ben-manes.caffeine:caffeine=2.8.0
{noformat}

I'm not actually sure what the correct way to update these is, however.
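For reference, the expected content of the missing file is just the hex-encoded SHA-1 of the jar, which plain Java can produce; this is only a sketch of the format, not the project's actual tooling for updating checksums.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Demo {
    // Hex-encode the SHA-1 digest of the given bytes, matching the format of
    // the .sha1 files under solr/licenses.
    static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is always present in the JDK
        }
    }

    public static void main(String[] args) {
        // SHA-1 of the empty input is a well-known constant.
        String hex = sha1Hex(new byte[0]);
        if (!hex.equals("da39a3ee5e6b4b0d3255bfef95601890afd80709")) {
            throw new AssertionError(hex);
        }
        System.out.println(hex);
    }
}
```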

> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.6
>
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}}, and an "eviction 
> mechanism", i.e. {{cache}}, implemented by a Caffeine cache.
> The relation between them is that the "storage" enforces a strict limit on the 
> number of entries ({{numberOfBlocksPerBank * numberOfBanks}}), whereas the 
> "eviction mechanism" takes care of freeing entries from the storage thanks to 
> the {{maximumSize}} of the Caffeine cache being set to 
> {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on the Caffeine cache to eventually free at least one entry 
> from the storage. If that doesn't happen, the {{BlockCache}} starts to fail all 
> new stores.
> As it turns out, the Caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}}, and no {{getIfPresent}} that *finds 
> an entry*, is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio): the cache is poisoned by non-reusable entries, so new, 
> reusable entries are not stored and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> A change in Caffeine that triggers its internal cleanup mechanism regardless 
> of whether {{getIfPresent}} gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in Caffeine 2.8.4.
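The stuck state can be illustrated with a self-contained toy model. Everything below is an illustrative assumption, not Solr's {{BlockCache}} or Caffeine's actual implementation: a bounded "storage" whose stores fail when the strict limit is reached, paired with an "eviction mechanism" whose cleanup only piggybacks on a read that finds an entry.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy model of the failure mode: storage enforces a strict limit,
// while eviction down to maximumSize = limit - 1 runs only when a
// read actually finds an entry (mirroring the lazy cleanup).
class ToyBlockCache {
    private final int limit;            // strict storage limit
    private final int maximumSize;      // eviction target: limit - 1
    private final Deque<String> lru = new ArrayDeque<>();
    private final Set<String> stored = new HashSet<>();

    ToyBlockCache(int limit) {
        this.limit = limit;
        this.maximumSize = limit - 1;
    }

    /** Fails (returns false) once the strict limit is reached. */
    boolean store(String key) {
        if (stored.size() >= limit) {
            return false;               // no room: the store is rejected
        }
        stored.add(key);
        lru.addLast(key);
        return true;                    // note: no cleanup happens here
    }

    /** Cleanup piggybacks only on a read that finds an entry. */
    boolean getIfPresent(String key) {
        boolean hit = stored.contains(key);
        if (hit) {
            while (stored.size() > maximumSize) {
                stored.remove(lru.pollFirst());  // evict oldest entry
            }
        }
        return hit;
    }
}
{code}
With a limit of 3, storing "a", "b", "c" fills the storage; every further store fails, and misses never free space. Only a read that hits an existing entry triggers cleanup and unblocks the next store, which is exactly the unlucky-read-pattern deadlock described above.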



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Reopened] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-27 Thread Mike Drob (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reopened SOLR-14498:
--

> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.6
>
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}}, and an "eviction 
> mechanism", i.e. {{cache}}, implemented by a Caffeine cache.
> The relation between them is that the "storage" enforces a strict limit on the 
> number of entries ({{numberOfBlocksPerBank * numberOfBanks}}), whereas the 
> "eviction mechanism" takes care of freeing entries from the storage thanks to 
> the {{maximumSize}} of the Caffeine cache being set to 
> {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on the Caffeine cache to eventually free at least one entry 
> from the storage. If that doesn't happen, the {{BlockCache}} starts to fail all 
> new stores.
> As it turns out, the Caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}}, and no {{getIfPresent}} that *finds 
> an entry*, is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio): the cache is poisoned by non-reusable entries, so new, 
> reusable entries are not stored and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> A change in Caffeine that triggers its internal cleanup mechanism regardless 
> of whether {{getIfPresent}} gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in Caffeine 2.8.4.






[jira] [Created] (SOLR-14518) Add support partitioned unique agg in JSON facets.

2020-05-27 Thread Joel Bernstein (Jira)
Joel Bernstein created SOLR-14518:
-

 Summary: Add support partitioned unique agg in JSON facets.
 Key: SOLR-14518
 URL: https://issues.apache.org/jira/browse/SOLR-14518
 Project: Solr
  Issue Type: New Feature
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein


There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact counts can be calculated by simply sending the bucket level 
unique counts to the aggregator where they can be merged. Suggested syntax is 
simply to add a boolean flag to the unique function: *unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.
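
For illustration, the proposed flag might appear in a JSON facet request like this (the collection and field names are made up, and the two-argument form is only the suggested syntax from this issue, not an existing API):
{code:json}
{
  "query": "*:*",
  "facet": {
    "by_category": {
      "type": "terms",
      "field": "category",
      "facet": {
        "distinct_users": "unique(user_id, true)"
      }
    }
  }
}
{code}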






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg in JSON facets.

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Summary: Add support for partitioned unique agg in JSON facets.  (was: Add 
support partitioned unique agg in JSON facets.)

> Add support for partitioned unique agg in JSON facets.
> --
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact counts can be calculated by simply sending the bucket level 
> unique counts to the aggregator where they can be merged. Suggested syntax is 
> simply to add a boolean flag to the unique function: *unique*(field, true).
> The true turns on the "partitioned" unique logic. The default is false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg in JSON facets.

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Description: 
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact counts can be calculated by simply sending the bucket level 
unique counts to the aggregator where they can be merged. Suggested syntax is 
to add a boolean flag to the unique function: *unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.

  was:
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact counts can be calculated by simply sending the bucket level 
unique counts to the aggregator where they can be merged. Suggested syntax is 
simply to add a boolean flag to the unique function: *unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.


> Add support for partitioned unique agg in JSON facets.
> --
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact counts can be calculated by simply sending the bucket level 
> unique counts to the aggregator where they can be merged. Suggested syntax is 
> to add a boolean flag to the unique function: *unique*(field, true).
> The true turns on the "partitioned" unique logic. The default is false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets.

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Summary: Add support for partitioned unique agg to JSON facets.  (was: Add 
support for partitioned unique agg in JSON facets.)

> Add support for partitioned unique agg to JSON facets.
> --
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact counts can be calculated by simply sending the bucket level 
> unique counts to the aggregator where they can be merged. Suggested syntax is 
> to add a boolean flag to the unique function: *unique*(field, true).
> The true turns on the "partitioned" unique logic. The default is false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets.

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Description: 
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact counts can be calculated by simply sending the bucket level 
unique counts to the aggregator where they can be merged. Suggested syntax is 
to add a boolean flag to the unique aggregation function: *unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.

  was:
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact counts can be calculated by simply sending the bucket level 
unique counts to the aggregator where they can be merged. Suggested syntax is 
to add a boolean flag to the unique function: *unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.


> Add support for partitioned unique agg to JSON facets.
> --
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact counts can be calculated by simply sending the bucket level 
> unique counts to the aggregator where they can be merged. Suggested syntax is 
> to add a boolean flag to the unique aggregation function: *unique*(field, 
> true).
> The true turns on the "partitioned" unique logic. The default is false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Summary: Add support for partitioned unique agg to JSON facets  (was: Add 
support for partitioned unique agg to JSON facets.)

> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact counts can be calculated by simply sending the bucket level 
> unique counts to the aggregator where they can be merged. Suggested syntax is 
> to add a boolean flag to the unique aggregation function: *unique*(field, 
> true).
> The true turns on the "partitioned" unique logic. The default is false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Description: 
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact unique counts can be calculated by simply sending the bucket 
level unique counts to the aggregator where they can be merged. Suggested 
syntax is to add a boolean flag to the unique aggregation function: 
*unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.

  was:
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact counts can be calculated by simply sending the bucket level 
unique counts to the aggregator where they can be merged. Suggested syntax is 
to add a boolean flag to the unique aggregation function: *unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.


> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be merged. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(field, true).
> The true turns on the "partitioned" unique logic. The default is false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Description: 
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact unique counts can be calculated by simply sending the bucket 
level unique counts to the aggregator where they can be summed. Suggested 
syntax is to add a boolean flag to the unique aggregation function: 
*unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.

  was:
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact unique counts can be calculated by simply sending the bucket 
level unique counts to the aggregator where they can be merged. Suggested 
syntax is to add a boolean flag to the unique aggregation function: 
*unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.


> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be summed. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(field, true).
> The true turns on the "partitioned" unique logic. The default is false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Description: 
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact unique counts can be calculated by simply sending the bucket 
level unique counts to the aggregator where they can be summed. Suggested 
syntax is to add a boolean flag to the unique aggregation function: 
*unique*(field, true).

The *true* value turns on the "partitioned" unique logic. The default is false.

  was:
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact unique counts can be calculated by simply sending the bucket 
level unique counts to the aggregator where they can be summed. Suggested 
syntax is to add a boolean flag to the unique aggregation function: 
*unique*(field, true).

The true turns on the "partitioned" unique logic. The default is false.


> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be summed. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(field, true).
> The *true* value turns on the "partitioned" unique logic. The default is 
> false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Description: 
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact unique counts can be calculated by simply sending the bucket 
level unique counts to the aggregator where they can be summed. Suggested 
syntax is to add a boolean flag to the unique aggregation function: 
*unique*(partitioned_field, true).

The *true* value turns on the "partitioned" unique logic. The default is false.

  was:
There are scenarios where documents are partitioned across shards based on the 
same field that the *unique* agg is applied to with JSON facets. In this 
scenario exact unique counts can be calculated by simply sending the bucket 
level unique counts to the aggregator where they can be summed. Suggested 
syntax is to add a boolean flag to the unique aggregation function: 
*unique*(field, true).

The *true* value turns on the "partitioned" unique logic. The default is false.


> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be summed. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(partitioned_field, true).
> The *true* value turns on the "partitioned" unique logic. The default is 
> false.






[jira] [Updated] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14518:
--
Component/s: Facet Module

> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be summed. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(partitioned_field, true).
> The *true* value turns on the "partitioned" unique logic. The default is 
> false.






[jira] [Commented] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Mikhail Khludnev (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118105#comment-17118105
 ] 

Mikhail Khludnev commented on SOLR-14518:
-

[~jbernste] I suppose a second boolean argument doesn't give the user a clue, 
even if one can guess that it controls this aspect of the algorithm. How can I 
tell whether true or false fits my need? Also, I'd like to know whether 
{{uniqueBlock(field)}} already covers this use case?

> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be summed. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(partitioned_field, true).
> The *true* value turns on the "partitioned" unique logic. The default is 
> false.






[GitHub] [lucene-solr] yuriy-b-koval opened a new pull request #1540: SOLR-14517 MM local params value is ignored in edismax queries with operators

2020-05-27 Thread GitBox


yuriy-b-koval opened a new pull request #1540:
URL: https://github.com/apache/lucene-solr/pull/1540


   
   
   
   # Description
   
   Make processing of `mm` specified in local params consistent with its 
behavior when specified as a query param.
   
   # Solution
   If an edismax query has operators, we look up the user-specified value. Now 
we check both local params and request params.
   
   # Tests
   
   Added a corresponding test case with `mm` as a local param to every existing 
test case in `TestExtendedDismaxParser#testMinShouldMatchOptional`.
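   
   A minimal illustration of the two ways `mm` can be supplied (query text and 
values are made up):
   
   ```
   # mm as a request parameter (already honored):
   q=solr is awesome&defType=edismax&mm=2
   
   # mm as a local param, with the query containing operators
   # (previously ignored in this case, now honored as well):
   q={!edismax mm=2}solr is awesome AND fast
   ```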
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118174#comment-17118174
 ] 

Joel Bernstein commented on SOLR-14518:
---

To follow the *uniqueBlock* convention, we could also create a new function 
called *uniqueShard*: the same basic idea as *uniqueBlock*, just without the 
need for a block join. 

> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be summed. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(partitioned_field, true).
> The *true* value turns on the "partitioned" unique logic. The default is 
> false.






[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-27 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118175#comment-17118175
 ] 

Erick Erickson commented on SOLR-14498:
---

[~mdrob] I'll fix this momentarily; I'm sure [~ab] is asleep. The magic is:

If changing dependencies, first execute:
{noformat}
gradlew --write-locks
{noformat}
then:
{noformat}
gradlew updateLicenses
{noformat}

BTW, {{gradlew helpAnt}} lists useful commands for "how do I do something in 
Gradle that I used to do in Ant"... 


> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.6
>
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}}, and an "eviction 
> mechanism", i.e. {{cache}}, implemented by a Caffeine cache.
> The relation between them is that the "storage" enforces a strict limit on the 
> number of entries ({{numberOfBlocksPerBank * numberOfBanks}}), whereas the 
> "eviction mechanism" takes care of freeing entries from the storage thanks to 
> the {{maximumSize}} of the Caffeine cache being set to 
> {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on the Caffeine cache to eventually free at least one entry 
> from the storage. If that doesn't happen, the {{BlockCache}} starts to fail all 
> new stores.
> As it turns out, the Caffeine cache may not reduce its size to the desired 
> {{maximumSize}} for as long as no {{put}}, and no {{getIfPresent}} that *finds 
> an entry*, is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio): the cache is poisoned by non-reusable entries, so new, 
> reusable entries are not stored and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> A change in Caffeine that triggers its internal cleanup mechanism regardless 
> of whether {{getIfPresent}} gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in Caffeine 2.8.4.






[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118177#comment-17118177
 ] 

ASF subversion and git services commented on SOLR-14498:


Commit 598cbc5c737b3953fbe0312c011d3de0f2bb58cb in lucene-solr's branch 
refs/heads/master from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=598cbc5 ]

SOLR-14498: BlockCache gets stuck not accepting new stores. Fix gradle 
:solr:core:validateJarChecksums


> BlockCache gets stuck not accepting new stores
> --
>
> Key: SOLR-14498
> URL: https://issues.apache.org/jira/browse/SOLR-14498
> Project: Solr
>  Issue Type: Bug
>  Components: query
>Affects Versions: 6.5, 6.6.5, master (9.0), 7.7.3, 8.5.1
>Reporter: Jakub Zytka
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.6
>
>
> {{BlockCache}} uses two components: "storage", i.e. {{banks}} and "eviction 
> mechanism", i.e {{cache}}, implemented by caffeine cache.
> The relation between them is that "storage" enforces a strict limit for the 
> number of entries (
> {{numberOfBlocksPerBank * numberOfBanks}}) whereas the "eviction mechanism" 
> takes care of freeing entries from the storage thanks to {{maximumSize}} set 
> for the caffeine cache to {{numberOfBlocksPerBank * numberOfBanks - 1}}.
> The storage relies on caffeine cache to eventually free at least 1 entry from 
> the storage. If that doesn't happen the {{BlockCache}} starts to fail all new 
> stores.
> As it turns out, the caffeine cache may not reduce it's size to the desired 
> {{maximumSize}} for as long as no {{put}} or {{getIfPresent}} which *finds an 
> entry* is executed.
> With a sufficiently unlucky read pattern, the block cache may be rendered 
> useless (0 hit ratio):
> cache poisoned by non-reusable entries; new, reusable entries are not stored 
> and thus not reused.
> Further info may be found in 
> [https://github.com/ben-manes/caffeine/issues/420]
>  
> A change in caffeine that triggers its internal cleanup mechanism regardless 
> of whether getIfPresent gets a hit has been implemented in 
> [https://github.com/ben-manes/caffeine/commit/7239bb0dda2af1e7301e8f66a5df28215b5173bc]
> and is due to be released in caffeine 2.8.4
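The interplay described above can be sketched as a self-contained simulation (illustrative only; this is not Solr's BlockCache or caffeine code, and the class and field names are invented):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy model of the invariant described above: "storage" enforces a hard
// capacity, while the eviction side is bounded one entry lower, so a slot is
// always free *provided eviction actually runs*. Flipping evictionRuns to
// false models caffeine's lazy cleanup never firing: all new stores fail.
class BlockCacheSketch {
    static final int CAPACITY = 4; // stands in for numberOfBlocksPerBank * numberOfBanks
    final Set<Integer> storage = new HashSet<>();            // strict limit
    final Deque<Integer> evictionQueue = new ArrayDeque<>(); // stands in for the caffeine cache
    boolean evictionRuns = true;                             // false = cleanup never triggered

    boolean store(int block) {
        if (evictionRuns && evictionQueue.size() >= CAPACITY - 1) {
            storage.remove(evictionQueue.removeFirst());     // eviction frees a slot
        }
        if (storage.size() >= CAPACITY) {
            return false;                                    // store rejected: cache is "stuck"
        }
        storage.add(block);
        evictionQueue.addLast(block);
        return true;
    }
}
```

With {{evictionRuns}} left true every store succeeds; with it false, the first CAPACITY stores fill the storage and every later store is rejected, which is the stuck state this issue describes.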



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118178#comment-17118178
 ] 

Joel Bernstein commented on SOLR-14518:
---

There are two ways to co-locate records data on the same shard, one is block 
join and the other is composite id routing. This would provide fast unique 
functionality for those that are using composite id routing.

> Add support for partitioned unique agg to JSON facets
> -
>
> Key: SOLR-14518
> URL: https://issues.apache.org/jira/browse/SOLR-14518
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Joel Bernstein
>Priority: Major
>
> There are scenarios where documents are partitioned across shards based on 
> the same field that the *unique* agg is applied to with JSON facets. In this 
> scenario exact unique counts can be calculated by simply sending the bucket 
> level unique counts to the aggregator where they can be summed. Suggested 
> syntax is to add a boolean flag to the unique aggregation function: 
> *unique*(partitioned_field, true).
> The *true* value turns on the "partitioned" unique logic. The default is 
> false.
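The aggregator-side merge the description proposes is just a sum. A hedged sketch (class and method names invented, not Solr's facet module code):

```java
// When the collection is routed on the same field unique() aggregates, each
// distinct value lives on exactly one shard, so the per-shard unique counts
// never overlap and the aggregator can sum them for an exact total.
class PartitionedUniqueSketch {
    static long partitionedUnique(long[] perShardUniqueCounts) {
        long total = 0;
        for (long count : perShardUniqueCounts) {
            total += count; // exact because value sets are disjoint across shards
        }
        return total;
    }
}
```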



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118178#comment-17118178
 ] 

Joel Bernstein edited comment on SOLR-14518 at 5/28/20, 12:26 AM:
--

There are two ways to co-locate records on the same shard, one is block join 
and the other is composite id routing. This would provide fast unique 
functionality for those that are using composite id routing.


was (Author: joel.bernstein):
There are two ways to co-locate records data on the same shard, one is block 
join and the other is composite id routing. This would provide fast unique 
functionality for those that are using composite id routing.







[jira] [Comment Edited] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118178#comment-17118178
 ] 

Joel Bernstein edited comment on SOLR-14518 at 5/28/20, 12:28 AM:
--

There are two common approaches to co-locate records on the same shard, one is 
block join and the other is composite id routing. This would provide fast 
unique functionality for those that are using composite id routing.


was (Author: joel.bernstein):
There are two ways to co-locate records on the same shard, one is block join 
and the other is composite id routing. This would provide fast unique 
functionality for those that are using composite id routing.







[jira] [Comment Edited] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118178#comment-17118178
 ] 

Joel Bernstein edited comment on SOLR-14518 at 5/28/20, 12:29 AM:
--

There are two frequently used approaches to co-locate records on the same 
shard, one is block join and the other is composite id routing. This would 
provide fast unique functionality on the routing key for those that are using 
composite id routing.


was (Author: joel.bernstein):
There are two common approaches to co-locate records on the same shard, one is 
block join and the other is composite id routing. This would provide fast 
unique functionality for those that are using composite id routing.







[jira] [Updated] (SOLR-14480) Fix or suppress warnings in solr/cloud/api

2020-05-27 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14480:
--
Summary: Fix or suppress warnings in solr/cloud/api  (was: Fix or suppress 
warnings in solr/api)

> Fix or suppress warnings in solr/cloud/api
> --
>
> Key: SOLR-14480
> URL: https://issues.apache.org/jira/browse/SOLR-14480
> Project: Solr
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Erick Erickson
>Assignee: Atri Sharma
>Priority: Major
>
> [~atri] Here's one for you!
> Here's how I'd like to approach this:
>  * Let's start with solr/core, one subdirectory at a time.
>  * See SOLR-14474 for how we want to address auxiliary classes, especially 
> the question to move them to their own file or nest them. It'll be fuzzy 
> until we get some more experience.
>  * Let's just clean everything up _except_ deprecations. My thinking here is 
> that there will be a bunch of code changes that we can/should backport to 8x 
> to clean up the warnings. Deprecations will be (probably) 9.0 only so 
> there'll be fewer problems with maintaining the two branches if we leave 
> deprecations out of the mix for the present.
>  * Err on the side of adding @SuppressWarnings rather than code changes for 
> this phase. If it's reasonably safe to change the code (say by adding ) do 
> so, but substantive changes are too likely to have unintended consequences. 
> I'd like to reach a consensus on what changes are "safe", that'll probably be 
> an ongoing discussion as we run into them for a while.
>  * I expect there'll be a certain amount of stepping on each other's toes, no 
> doubt to clean some things up in one of the subdirectories we'll have to 
> change something in an ancestor directory, but we can deal with those as they 
> come up, probably that'll just mean merging the current master with the fork 
> we're working on...
> Let me know what you think or if you'd like to change the approach.
> Oh, and all I did here was take the second subdirectory of solr/core that I 
> found, feel free to take on something else.
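The "err on the side of @SuppressWarnings" approach above looks like this in practice (a generic illustration, not code from solr/cloud/api):

```java
import java.util.List;

class SuppressExample {
    // Keep the annotation on the smallest element that needs it (a method,
    // field, or local variable), so new warnings elsewhere still surface.
    @SuppressWarnings("unchecked")
    static <T> List<T> castList(Object value) {
        return (List<T>) value; // unchecked cast, deliberately suppressed
    }
}
```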






[jira] [Comment Edited] (SOLR-14518) Add support for partitioned unique agg to JSON facets

2020-05-27 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118178#comment-17118178
 ] 

Joel Bernstein edited comment on SOLR-14518 at 5/28/20, 12:44 AM:
--

There are two frequently used approaches to co-locate records on the same 
shard, one is block join and the other is composite id routing. This would 
provide fast distributed unique functionality on the routing key for those that 
are using composite id routing.
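For readers unfamiliar with composite id routing: ids of the form "routeKey!docId" are routed by the prefix, so all documents sharing a route key land on the same shard. A simplified sketch (this is not Solr's actual CompositeIdRouter hashing):

```java
class RoutingSketch {
    // Documents whose ids share the same "routeKey!" prefix map to the same
    // shard, which is what makes per-shard unique counts safe to sum.
    static int shardFor(String compositeId, int numShards) {
        int bang = compositeId.indexOf('!');
        String routeKey = bang >= 0 ? compositeId.substring(0, bang) : compositeId;
        return Math.floorMod(routeKey.hashCode(), numShards); // placeholder hash, not Solr's
    }
}
```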


was (Author: joel.bernstein):
There are two frequently used approaches to co-locate records on the same 
shard, one is block join and the other is composite id routing. This would 
provide fast unique functionality on the routing key for those that are using 
composite id routing.







[jira] [Created] (SOLR-14519) Fix or suppress warnings in solr/cloud/autoscaling/

2020-05-27 Thread Erick Erickson (Jira)
Erick Erickson created SOLR-14519:
-

 Summary: Fix or suppress warnings in solr/cloud/autoscaling/
 Key: SOLR-14519
 URL: https://issues.apache.org/jira/browse/SOLR-14519
 Project: Solr
  Issue Type: Sub-task
Reporter: Erick Erickson
Assignee: Erick Erickson









[jira] [Updated] (SOLR-14467) inconsistent server errors combining relatedness() with allBuckets:true

2020-05-27 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14467:
--
Attachment: SOLR-14467.patch
beast2.log.txt
beast.log.txt
Status: Open  (was: Open)

{quote}There's a nocommit marking an edge case problem for distributed merging. 
I think this happens for the case buckets are merged without any shard having 
initialized a bucket.
{quote}
Ah yeah ... that is an interesting edge case. (discussion of your suggested 
solution below)

Another interesting discrepancy in the output that I noticed when testing your 
patch is that in "single core" mode the "key" for {{relatedness()}} is 
completley missing from the {{allBuckets}} bucket while in distributed mode the 
key was coming back with a literal value of {{null}}. (It jumped out at me 
because i didn't understand why you had needed to add the {{notPresent(List)}} 
method to the test – and then realized it was to account for the possibility 
that either the key had no values, or the key had a {{null}} value).

I confirmed this is because of a logic discrepancy between how the per-shard 
(and single core) code paths dealing with {{SlotAcc.getValue()}} ignore result 
values of {{null}} (and don't add the key to the response) but the 
corresponding code path in {{FacetBucket.getMergedBucket()}} does not – we 
should be able to (safely) fix {{FacetBucket.getMergedBucket()}} to be 
consistent w/o impacting any other aggregations, since {{RelatednessAgg}} is 
the only one that can produce a "non-null" slot value on a shard (to return the 
fg/bg sizes for buckets that don't exist on the shard) but may still want a 
final "null" merged value (for allBuckets).

(stats like "sum" always return a value at the shard level, even if there are 
no docs in the bucket, so they always produce a merged value; while stats like 
"min" don't return anything at the shard level if no docs are in the bucket, so 
a merger is never initialized)

Which circles back to your point about merging in the situation where there are 
no buckets: we also have to consider the "how allBuckets behaves when there are 
no buckets" in the _non_ cloud case, where no merging happens – based on 
testing, some (not sure if all?) processors evidently don't do any collection on 
the {{allBuckets}} slot at all in this situation – meaning that with the 
{{implied}} concept you introduced for the " {{// since we haven't been told 
about any docs for this slot, use a slot w/no counts}} " code path of 
{{getValue(int)}}, we have to ensure that (in the non-shard situation) a bucket 
that is implied externalizes as {{null}}.

So here's an updated patch with some additional improvements/fixes...
 * rolled the test patch and the code patch into one
 * beefed up javadocs for the SlotContext API methods
 * tweaked ref-guide wording to make it clear the {{relatedness()}} key won't 
be in the {{allBuckets}} bucket at all
 * tweaked TestCloudJSONFacetSKG to require that the key must be explicitly 
missing from allBuckets
 * added explicit testing of both basic allBuckets usage as well as the 
"allBuckets when there are no regular buckets" situation to TestJsonFacets & 
TestJsonFacetRefinement
 * fixed {{FacetBucket.getMergedBucket()}} to ignore null (merged) values from 
stats
 * fixed {{BucketData.externalize()}} to consider {{implied}} flag for the 
non-shard case and return null
 * added some nocommits with reminders to revisit & consider if/how {{implied}} 
should affect plumbing methods like hashCode, equals, compareTo, etc...
 * incorporate your suggested change to {{BucketData.getMergedResult()}} to 
assume that entirely implied merged data must be allBuckets

...back to your idea: I think you're correct, and this change should be safe. 
I'm trying to beast it now and want to think about it more ... BUT ... I think 
even if/once we confirm this is a "correct" fix, there might be a simpler 
solution?

Since we definitely need the {{implied}} concept to deal with the "never got a 
SlotContext so we don't know if it's the allBucket" situation, we may not need 
the special {{ALL_BUCKETS_SPECIAL_BUCKET_DATA}} object – I need to think it 
through a little more, but the gist of what I have in mind is...
 * make {{implied}} non final
 * set {{implied=false}} anytime we {{incCounts()}}
 * if we know a bucket is {{allBuckets}} then never call {{incCounts()}} on it 
(already true)
 * if we are merging in a bucket from a shard that said it was {{implied:true}} 
then make sure we don't call {{incCounts()}} when merging it
 ** we should even be able to prevent the counts from being returned for an 
implied shard?
 ** (and assert that the counts are 0 when externalizing both {{isShard}} and 
{{implied}} ?)
 * for non-shard requests, externalizing an {{implied}} should always just be 
{{null}}

I think that would let us remove a lot of the s

[jira] [Created] (SOLR-14520) json.facets: allBucket:true can cause server errors when combined with refine:true

2020-05-27 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-14520:
-

 Summary: json.facets: allBucket:true can cause server errors when 
combined with refine:true
 Key: SOLR-14520
 URL: https://issues.apache.org/jira/browse/SOLR-14520
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Facet Module
Reporter: Chris M. Hostetter


Another bug that was discovered while testing SOLR-14467...
In some situations, using {{allBuckets:true}} in conjunction with 
{{refine:true}} can cause server errors during the "refinement" requests to the 
individual shards -- either NullPointerExceptions from some (nested) SlotAccs 
when SpecialSlotAcc tries to collect them, or ArrayIndexOutOfBoundsException 
from CountSlotArrAcc.incrementCount because it's asked to collect to "large" 
slot# values even though it's been initialized with a size of '1'

NOTE: these problems may be specific to FacetFieldProcessorByArrayDV - i have 
not yet seen similar failures from FacetFieldProcessorByArrayUIF (those are the 
only 2 used when doing refinement) but that may just be a fluke of testing.






[jira] [Updated] (SOLR-14520) json.facets: allBucket:true can cause server errors when combined with refine:true

2020-05-27 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14520:
--
Attachment: SOLR-14520.patch
Status: Open  (was: Open)

attaching a patch file with small changes to TestJsonFacetRefinement that 
easily demonstrates one of these types of problems...
{noformat}
   [junit4]   2> 8945 ERROR (qtp561920109-35) [x:collection1 ] o.a.s.h.RequestHandlerBase org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at null: java.lang.NullPointerException
   [junit4]   2>at org.apache.solr.search.facet.SlotAcc$SumSlotAcc.collect(SlotAcc.java:401)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessor$SpecialSlotAcc.collect(FacetFieldProcessor.java:778)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessorByArrayDV.collect(FacetFieldProcessorByArrayDV.java:335)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessorByArrayDV.collectDocs(FacetFieldProcessorByArrayDV.java:266)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessorByArrayDV.collectDocs(FacetFieldProcessorByArrayDV.java:161)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessorByArray.calcFacets(FacetFieldProcessorByArray.java:90)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessorByArray.process(FacetFieldProcessorByArray.java:62)
   [junit4]   2>at org.apache.solr.search.facet.FacetRequest.process(FacetRequest.java:414)
   [junit4]   2>at org.apache.solr.search.facet.FacetProcessor.processSubs(FacetProcessor.java:478)
   [junit4]   2>at org.apache.solr.search.facet.FacetProcessor.fillBucket(FacetProcessor.java:434)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessor.refineBucket(FacetFieldProcessor.java:922)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessor.refineFacets(FacetFieldProcessor.java:887)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessorByArray.calcFacets(FacetFieldProcessorByArray.java:70)
   [junit4]   2>at org.apache.solr.search.facet.FacetFieldProcessorByArray.process(FacetFieldProcessorByArray.java:62)
   [junit4]   2>at org.apache.solr.search.facet.FacetRequest.process(FacetRequest.java:414)
   [junit4]   2>at org.apache.solr.search.facet.FacetProcessor.processSubs(FacetProcessor.java:478)
   [junit4]   2>at org.apache.solr.search.facet.FacetProcessor.fillBucket(FacetProcessor.java:434)
   [junit4]   2>at org.apache.solr.search.facet.FacetQueryProcessor.process(FacetQuery.java:65)
   [junit4]   2>at org.apache.solr.search.facet.FacetRequest.process(FacetRequest.java:414)
   [junit4]   2>at org.apache.solr.search.facet.FacetModule.process(FacetModule.java:150)
   [junit4]   2>at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:329)
   [junit4]   2>at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:209)
   [junit4]   2>at org.apache.solr.core.SolrCore.execute(SolrCore.java:2591)
   [junit4]   2>at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:803)
   [junit4]   2>at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:582)
   [junit4]   2>at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)
   [junit4]   2>at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
   [junit4]   2>at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604)
   [junit4]   2>at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:166)
   [junit4]   2>at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604)
   [junit4]   2>at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
   [junit4]   2>at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
   [junit4]   2>at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
   [junit4]   2>at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
   [junit4]   2>at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
   [junit4]   2>at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
   [junit4]   2>at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
   [junit4]   2>at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
   [junit4]   2>at org.eclipse.jetty.server.handler.Scop

[jira] [Commented] (SOLR-14498) BlockCache gets stuck not accepting new stores

2020-05-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118236#comment-17118236
 ] 

ASF subversion and git services commented on SOLR-14498:


Commit 84c5dfc277d25735b17b1493daa98693286f4aed in lucene-solr's branch 
refs/heads/master from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=84c5dfc ]

SOLR-14498: BlockCache gets stuck not accepting new stores fixing checksums








[jira] [Created] (LUCENE-9385) Skip indexing facet drill down terms

2020-05-27 Thread Ankur (Jira)
Ankur created LUCENE-9385:
-

 Summary: Skip indexing facet drill down terms
 Key: LUCENE-9385
 URL: https://issues.apache.org/jira/browse/LUCENE-9385
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/facet
Affects Versions: 8.5.2
Reporter: Ankur


FacetsConfig creates index terms from the Facet dimension and path 
automatically for the purpose of supporting drill-down queries.

An application that does not need drill-down ends up paying the index cost of 
the extra terms.

Ideally an option to skip indexing these drill down terms should be exposed to 
the application.
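For context, the drill-down terms are one term per path prefix of the dimension. A hedged sketch of what the opt-out could look like (the method signature and flag here are hypothetical, not Lucene's actual FacetsConfig API):

```java
import java.util.ArrayList;
import java.util.List;

class DrillDownSketch {
    // Build the drill-down terms for one dimension/path pair; an application
    // that never drills down could pass indexDrillDown=false to skip them all.
    static List<String> drillDownTerms(String dim, String[] path, boolean indexDrillDown) {
        if (!indexDrillDown) {
            return List.of();                        // skip the extra index terms
        }
        List<String> terms = new ArrayList<>();
        StringBuilder term = new StringBuilder(dim);
        terms.add(term.toString());                  // dimension-only term
        for (String component : path) {
            term.append('\u001f').append(component); // separator joins path components
            terms.add(term.toString());              // one term per path prefix
        }
        return terms;
    }
}
```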


