[jira] [Updated] (LUCENE-9609) When the term of more than 16, highlight the query does not return
[ https://issues.apache.org/jira/browse/LUCENE-9609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangFeiCheng updated LUCENE-9609:
-
Description:
I noticed that when there are too many terms, the highlighted query is restricted. I know that in TermInSetQuery, when there are fewer terms, BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16 is used to improve query efficiency:
{code:java}
static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

public Query rewrite(IndexReader reader) throws IOException {
  final int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, BooleanQuery.getMaxClauseCount());
  if (termData.size() <= threshold) {
    BooleanQuery.Builder bq = new BooleanQuery.Builder();
    TermIterator iterator = termData.iterator();
    for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
      bq.add(new TermQuery(new Term(iterator.field(), BytesRef.deepCopyOf(term))), Occur.SHOULD);
    }
    return new ConstantScoreQuery(bq.build());
  }
  return super.rewrite(reader);
}
{code}
When the number of terms in the query exceeds 16, the createWeight method in TermInSetQuery is used instead:
{code:java}
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  return new ConstantScoreWeight(this, boost) {
    @Override
    public void extractTerms(Set<Term> terms) {
      // no-op
      // This query is for abuse cases when the number of terms is too high to
      // run efficiently as a BooleanQuery. So likewise we hide its terms in
      // order to protect highlighters
    }
    ..
  }
{code}
I want to ask: why do you say "we hide its terms in order to protect highlighters"? How is such "highlighter protection" implemented?
> When the term of more than 16, highlight the query does not return
> --
>
> Key: LUCENE-9609
> URL: https://issues.apache.org/jira/browse/LUCENE-9609
> Project: Lucene - Core
> Issue Type: Wish
> Components: core/search
> Affects Versions: 7.7.3
> Reporter: WangFeiCheng
> Priority: Minor
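To make the threshold behavior in the question concrete, here is a minimal, self-contained sketch (plain Java, no Lucene dependency; the constant name mirrors the snippet above, but the class and method are hypothetical, not actual Lucene code). It reproduces only the decision TermInSetQuery.rewrite makes: rewrite to a BooleanQuery of SHOULD clauses, whose terms highlighters can extract, only when the term count is at or below min(16, maxClauseCount); above that, the query stays opaque and extractTerms is a no-op.

```java
// Illustrative sketch of TermInSetQuery's rewrite decision (not Lucene code).
public class RewriteThresholdDemo {
    static final int BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD = 16;

    /** true  = rewritten to a BooleanQuery, so terms are visible to highlighters;
     *  false = stays a TermInSetQuery, whose extractTerms hides the terms. */
    static boolean rewritesToBooleanQuery(int termCount, int maxClauseCount) {
        int threshold = Math.min(BOOLEAN_REWRITE_TERM_COUNT_THRESHOLD, maxClauseCount);
        return termCount <= threshold;
    }

    public static void main(String[] args) {
        // 16 terms: rewritten, highlighters see the terms.
        System.out.println(rewritesToBooleanQuery(16, 1024)); // true
        // 17 terms: not rewritten, terms are hidden from highlighters.
        System.out.println(rewritesToBooleanQuery(17, 1024)); // false
    }
}
```

Note that BooleanQuery.getMaxClauseCount() also caps the threshold, so with a small max clause count even fewer than 16 terms can skip the rewrite.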
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss commented on a change in pull request #2068:
URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r523406193

## File path: gradle/native/disable-native.gradle
##
@@ -17,20 +17,65 @@
 // This is the master switch to disable all tasks that compile
 // native (cpp) code.
-def buildNative = propertyOrDefault("build.native", true).toBoolean()
+rootProject.ext {
+ buildNative = propertyOrDefault("build.native", true).toBoolean()
+}
+
+// Explicitly list all projects that should be configured for native extensions.
+// We could scan for projects with the cpp-library plugin but this is faster.
+def nativeProjects = allprojects.findAll {it.path in [
+":lucene:misc:native"
+]}
+
+def javaProjectsWithNativeDeps = allprojects.findAll {it.path in [
+":lucene:misc"
+]}
+
+// Set up defaults for projects with native dependencies.
+configure(javaProjectsWithNativeDeps, {
+ configurations {

Review comment:
Right. This is gradle deep-waters, but not too complicated once you get the hang of it. I am personally fond of simple and straightforward code, coming from assembly myself, but these days it's hard to avoid higher-level constructs. Sorry about this!

So, the code you asked about essentially applies ('configure') a code block (the second argument to configure - the closure) to all objects in the collection that is the first argument to configure. When you look at that patch, we create two such collections - nativeProjects and javaProjectsWithNativeDeps - each with the Project instances that match a corresponding path. This is very much as if you copy-pasted the same code into the build.gradle of each corresponding project - it avoids repetition and centralises common configuration.

The inside of this closure block is where things get more interesting and more gradle-esque. The "configurations" block is a project property that declares the project's configurations, to which artifacts (or dependencies) can be added.
You should understand how configurations work, because they are at the core of this setup. See here, for example (the gradle documentation is very good, if sometimes scattered across multiple documents): https://docs.gradle.org/current/userguide/dependency_management.html#declaring-dependencies

So, for any project in javaProjectsWithNativeDeps we declare a configuration called "nativeDeps". To this configuration we will attach a dependency on a "native" project - our cpp-library modules. In the case of :lucene:misc we attach it inside build.gradle:
```
nativeDeps project(":lucene:misc:native")
```
Then the code declares two project "extra" (ext) properties: testOptions and nativeDepsDir. testOptions is an array we use to collect various test attributes from bits and pieces of configuration scattered around other files. To make the evaluation work in the right order (to ensure the array is there), we put everything in a deferred-execution block (this is why the plugins.withType(JavaPlugin) stuff is needed...).

Finally, we apply a code closure configuring any of the project's tasks that have the type "Test", adding a java.library.path system property pointing at the nativeDepsDir location and a dependency on a task that will copy all dependencies (artifacts produced by dependencies attached to the nativeDeps configuration) to that folder. These two steps are only needed if we indeed plan to run with native libraries, so they're wrapped in an 'if'.

> and the copyNativeDeps task below copies only the needed library artifact without all the nested folder structure

It actually synchronizes the state of the target folder with the source files. Note that the "source" files are pointed to by the configuration (artifacts produced by this configuration's dependencies). The cpp-library plugin produces *multiple* configuration variants (debug, release - possibly with multiple artifacts each).
This is where it gets tricky, and this is why we declared the nativeDeps configuration with some "attributes" that allow gradle to pick just one of the produced configurations out of multiple choices. In most Java projects you'd never have to use this, but you combined cpp and java. Try removing those attributes from the configuration definition and see what happens. Then see this: https://docs.gradle.org/current/userguide/variant_model.html

Sorry for throwing so much at you, but you asked for it! :)

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.or
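Putting the pieces of the explanation above together, the overall shape of the pattern looks roughly like the following condensed Gradle (Groovy) sketch. This is illustrative only, not the actual disable-native.gradle: names such as nativeDeps, nativeDepsDir, copyNativeDeps, buildNative and javaProjectsWithNativeDeps come from the quoted patch and discussion, but the variant-selection attributes and other details are elided.

```
// Condensed, illustrative sketch of the pattern described above
// (not the actual disable-native.gradle).
configure(javaProjectsWithNativeDeps) {
  // Configuration to which each project's build.gradle attaches the native
  // project, e.g.: dependencies { nativeDeps project(":lucene:misc:native") }
  // (the real file also declares attributes so gradle can pick one
  // cpp-library variant, e.g. release vs. debug).
  configurations {
    nativeDeps
  }

  // Folder the native artifacts get synchronized into.
  ext.nativeDepsDir = file("${buildDir}/nativeDeps")

  // Deferred-execution block: runs only once the Java plugin is applied,
  // so evaluation ordering is right.
  plugins.withType(JavaPlugin) {
    def copyNativeDeps = tasks.register("copyNativeDeps", Sync) {
      from configurations.nativeDeps   // artifacts of the native project
      into nativeDepsDir               // Sync keeps target in step with source
    }

    tasks.withType(Test) {
      // Only wire this up when we actually plan to run with native libraries.
      if (rootProject.ext.buildNative) {
        systemProperty "java.library.path", nativeDepsDir
        dependsOn copyNativeDeps
      }
    }
  }
}
```

The Sync task type (rather than Copy) is what gives the "synchronizes the state of the target folder" behavior mentioned above.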
[jira] [Commented] (LUCENE-9607) TestUniformSplitPostingFormat.testCheckIntegrityReadsAllBytes test failure
[ https://issues.apache.org/jira/browse/LUCENE-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232001#comment-17232001 ] Michael Sokolov commented on LUCENE-9607: - This always seems to be associated with a build of {{Lucene-Solr-cloud2refimpl-Linux}}. Maybe there is a fix that never made it to that branch? It does point to the risk of drift when we maintain these long-running branches. I don't know if there is any effort being made to track trunk there? > TestUniformSplitPostingFormat.testCheckIntegrityReadsAllBytes test failure > -- > > Key: LUCENE-9607 > URL: https://issues.apache.org/jira/browse/LUCENE-9607 > Project: Lucene - Core > Issue Type: Bug >Reporter: Michael McCandless >Priority: Major > > CI builds have been failing with this: > {noformat} > FAILED: > org.apache.lucene.codecs.uniformsplit.TestUniformSplitPostingFormat.testCheckIntegrityReadsAllBytes > Error Message: > java.lang.AssertionError > Stack Trace: > java.lang.AssertionError > at > __randomizedtesting.SeedInfo.seed([43D1E1D1DB325AD7:3E13F00D7ACC8E7E]:0) > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.lucene.codecs.uniformsplit.TestUniformSplitPostingFormat.checkEncodingCalled(TestUniformSplitPostingFormat.java:63) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at > com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1000) > at > 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) > at > org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) > at > org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) > at > org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) > at > org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) > at > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > at > com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370) > at > com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:819) > at > com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:470) > at > com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:951) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:836) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:887) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:898) > at > org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) > at > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > at > org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) > at > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) > at > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) > at > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > at > 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) > at > org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) > at > org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) > at > org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) > at > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > at > com.carrotsearch.randomizedtesting.ThreadLeakC
[jira] [Created] (LUCENE-9610) Fix failures of TestKnnGraph.testMerge
Michael Sokolov created LUCENE-9610:
---
Summary: Fix failures of TestKnnGraph.testMerge
Key: LUCENE-9610
URL: https://issues.apache.org/jira/browse/LUCENE-9610
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael Sokolov

I saw three failures like those below reported to the mailing list last night. The seeds do not reproduce for me, but there's clearly something up.

FAILED: org.apache.lucene.index.TestKnnGraph.testMerge
Error Message:
java.lang.AssertionError: Attempted to walk entire graph but only visited 255 expected:<257> but was:<255>

FAILED: org.apache.lucene.index.TestKnnGraph.testMerge
Error Message:
java.lang.AssertionError: Attempted to walk entire graph but only visited 104 expected:<105> but was:<104>
Stack Trace:
java.lang.AssertionError: Attempted to walk entire graph but only visited 104 expected:<105> but was:<104>
at __randomizedtesting.SeedInfo.seed([F1E895FE123F786D:42494B1A1DE4CEF8]:0)
=
FAILED: org.apache.lucene.index.TestKnnGraph.testMerge
Error Message:
java.lang.AssertionError: Attempted to walk entire graph but only visited 104 expected:<105> but was:<104>
Stack Trace:
java.lang.AssertionError: Attempted to walk entire graph but only visited 104 expected:<105> but was:<104>
at __randomizedtesting.SeedInfo.seed([F1E895FE123F786D:42494B1A1DE4CEF8]:0)
--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
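The failing assertion compares the number of nodes reachable from the graph's entry point against the total node count. A minimal, self-contained sketch of that kind of connectivity check (hypothetical class and method names; this is not the actual TestKnnGraph code, which walks an HNSW graph):

```java
import java.util.*;

// Hypothetical sketch of a "walk the entire graph" check: count the nodes
// reachable from node 0 and compare with the expected total, as the failing
// assertion does ("only visited 104 expected:<105>").
public class GraphWalkCheck {
    static int reachableFrom(int start, Map<Integer, List<Integer>> neighbors) {
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);
        visited.add(start);
        while (!stack.isEmpty()) {
            for (int nbr : neighbors.getOrDefault(stack.pop(), List.of())) {
                if (visited.add(nbr)) stack.push(nbr);
            }
        }
        return visited.size();
    }

    public static void main(String[] args) {
        // 3-node graph where node 2 is disconnected: only 2 of 3 nodes are
        // visited, the same shape of mismatch as in the test failures.
        Map<Integer, List<Integer>> g =
            Map.of(0, List.of(1), 1, List.of(0), 2, List.of());
        System.out.println(reachableFrom(0, g)); // 2, not 3
    }
}
```

A failure of this check after a merge suggests some node ended up unreachable from the entry point in the merged graph.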
[jira] [Commented] (LUCENE-9610) Fix failures of TestKnnGraph.testMerge
[ https://issues.apache.org/jira/browse/LUCENE-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232007#comment-17232007 ] Michael Sokolov commented on LUCENE-9610:
-
Ah, but does repro with the full command:

gradlew test --tests TestKnnGraph.testMerge -Dtests.seed=25D36CF27DC4CAED -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt -Dtests.locale=fr-DZ -Dtests.timezone=Pacific/Yap -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[jira] [Commented] (LUCENE-9607) TestUniformSplitPostingFormat.testCheckIntegrityReadsAllBytes test failure
[ https://issues.apache.org/jira/browse/LUCENE-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232021#comment-17232021 ] Erick Erickson commented on LUCENE-9607:
-
I have my mail filters set to put all the reference impl failures in a separate mailbox, and that's the only place I recall seeing this, although I'm just going from memory. There are a series of other failures I see regularly; my guess is that there are other issues that are a higher priority at present. Also, IIUC, there's reference_impl_dev and reference_impl, so some fixes may already be done in dev.
[jira] [Resolved] (SOLR-14991) tag and remove obsolete branches
[ https://issues.apache.org/jira/browse/SOLR-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-14991. --- Fix Version/s: 8.8 Resolution: Fixed
These branches have been tagged/removed:
remotes/origin/jira-14151-revert
remotes/origin/jira/V2Request
remotes/origin/master-deprecations
remotes/origin/revert-776-remove_icu_dependency
remotes/origin/jira/http2
remotes/origin/SOLR-11795
remotes/origin/jira/LUCENE-8738
remotes/origin/jira/LUCENE-9312
remotes/origin/jira/LUCENE-9438
remotes/origin/jira/SOLR-13229
remotes/origin/jira/SOLR-13462
remotes/origin/jira/SOLR-13661-reverted
remotes/origin/jira/SOLR-13661_2
remotes/origin/jira/SOLR-13793
remotes/origin/jira/SOLR-13822_backup
remotes/origin/jira/SOLR-13834
remotes/origin/jira/SOLR-14354-revert
remotes/origin/jira/SOLR-14383
remotes/origin/jira/lucene-5438-nrt-replication
remotes/origin/jira/solr-12259
remotes/origin/jira/solr-13472
remotes/origin/jira/solr-13619
remotes/origin/jira/solr-13662
remotes/origin/jira/solr-13662-2
remotes/origin/jira/solr-13662-3
remotes/origin/jira/solr-13662-3_tmp
remotes/origin/jira/solr-13662-fixes
remotes/origin/jira/solr-13662-updated
remotes/origin/jira/solr-13718-8x
remotes/origin/jira/solr-13971
remotes/origin/jira/solr-13978
remotes/origin/jira/solr-14021
remotes/origin/jira/solr-14022
remotes/origin/jira/solr-14025
remotes/origin/jira/solr-14066-master
remotes/origin/jira/solr-14071
remotes/origin/jira/solr-14151-revert
remotes/origin/jira/solr-14151-revert-2
remotes/origin/jira/solr-14151-revert-8x
remotes/origin/jira/solr-14151-revert-8x-2
remotes/origin/jira/solr-14158
remotes/origin/jira/solr-14599
remotes/origin/jira/solr-14599_1
remotes/origin/jira/solr-14603
remotes/origin/jira/solr-14603-8x
remotes/origin/jira/solr-14616
remotes/origin/jira/solr-14799
remotes/origin/jira/solr-14914
remotes/origin/jira/solr14398
remotes/origin/jira/solr14404_fix
remotes/origin/lucene-6835
remotes/origin/lucene-6997
remotes/origin/lucene-7015
remotes/origin/solr-13131
remotes/origin/solr-6733
remotes/origin/starburst
> tag and remove obsolete branches
> Key: SOLR-14991
> URL: https://issues.apache.org/jira/browse/SOLR-14991
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
> Fix For: 8.8
>
> I'm going to gradually work through the branches, tagging and removing
> 1> anything with a Jira name that's fixed
> 2> anything that I'm certain will never be fixed (e.g. the various gradle build branches)
> So the changes will still be available; they just won't pollute the branch list. I'll list the branches here; all the tags will be history/branches/lucene-solr/
>
> This specifically will _not_ include
> 1> any release, e.g. branch_8_4
> 2> anything I'm unsure about. People who've created branches should expect some pings about this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-15000) Solr based enterprise level, one-stop search center products with high performance, high reliability and high scalability
[ https://issues.apache.org/jira/browse/SOLR-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232023#comment-17232023 ] David Eric Pugh commented on SOLR-15000: Congrats on opening a JIRA ticket with a great number, 15000! I wanted to share a couple of thoughts, because I suspect that you are right that TIS, as it stands, wouldn't make sense as part of Apache Solr.
1) As far as the Incubator goes, go through http://incubator.apache.org/cookbook/ and evaluate if it's a process worth embracing. It's not an insignificant process, and that's intentional. The ASF wants to grow and build new communities, not become a repository for abandoned projects. Community building is hard! I suspect that if you can build a basic community around TIS outside of the ASF, you will find it easier to join the Incubator program.
2) As a follow-on to #1, think about how to raise the profile of TIS. Do you have regular updates and public releases? Do you have active contributors from multiple organizations? Is there evangelism you can do with TIS to get there? For example, make sure that TIS has nice docs and a website, and get it added to https://github.com/frutik/awesome-search so people see it. Start tweeting about it ;-). A model to look at is Fess (https://github.com/codelibs/fess), which is somewhat similar to TIS ;-). Also, check out what I'm doing with Chorus, an ecommerce-focused stack based on Solr: https://github.com/querqy/chorus. To build community there, we're starting a series of workshops to teach people how to use Chorus, get folks excited about it, and build some momentum: https://plainschwarz.com/ps-salon/. Obviously, the Chorus effort is going to be a multiyear effort that consumes a lot of my energy, and that's probably what you should plan on for TIS if you want to go this route!
3) An alternative path might be to take all of the great parts of TIS and contribute them to existing projects. For example, your schema editing stuff sounds cool; maybe you could partner up with the https://github.com/yasa-org/yasa project to push forward on that? Or, if you enjoy old Angular code, maybe submit PRs to the existing Solr Admin to add that there ;-).
4) Another idea: instead of rolling a full application stack, which honestly would be daunting for someone to just adopt wholesale, leverage all the new Package stuff in Solr to make Packages that run with Solr 8... Create some good packages, submit them to http://solr.cool, and look to get adoption that way.
Hope this helps!
> Solr based enterprise level, one-stop search center products with high performance, high reliability and high scalability
> Key: SOLR-15000
> URL: https://issues.apache.org/jira/browse/SOLR-15000
> Project: Solr
> Issue Type: Wish
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Admin UI
> Reporter: bai sui
> Priority: Minor
> Attachments: add-collection-step-2-expert.png, add-collection-step-2.png
>
> h2. Summary
> I have developed an enterprise application based on Solr, named TIS. With TIS you can quickly build an enterprise search service. TIS includes three components:
> - offline index building platform: the data is exported from an ER database (MySQL, SQL Server, and so on) through full table scanning, and then the wide table is constructed by a local MR tool, or constructed directly by Spark
> - incremental real-time channel: changes are transmitted to Kafka, real-time stream computation is carried out by Flink, and results are submitted to the search engine, ensuring that the data in the search engine and the database stay consistent in near real time
> - search engine: currently based on Solr 8
> TIS integrates these components seamlessly and brings users a one-stop, out-of-the-box experience.
> h2. My question
> I want to feed my code back to the community, but TIS focuses on Enterprise Application Search, just as Elasticsearch focuses on visual analysis of time series data. Because Solr is a general search product, *I don't think TIS can be merged directly into Solr. Is it possible for TIS to be a new incubation project under Apache?*
> h2. TIS main Features
> - The schema and solrconfig storage are separated from ZK and stored in MySQL. Version management is provided; users can roll back to a historical version of the configuration.
> !add-collection-step-2-expert.png|width=500!
> !add-collection-step-2.png|width=500!
> Schema editing mode can be switched between visual editing mode and an advanced expert mode
> - Define wide table rules based on the selected data tabl
[GitHub] [lucene-solr] ErickErickson commented on pull request #2078: SOLR-14986: Add warning to ref guide that using properties.name is an…
ErickErickson commented on pull request #2078: URL: https://github.com/apache/lucene-solr/pull/2078#issuecomment-727210833 Thanks Houston, I incorporated your suggestions. I'll push the fix shortly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [lucene-solr] ErickErickson closed pull request #2078: SOLR-14986: Add warning to ref guide that using properties.name is an…
ErickErickson closed pull request #2078: URL: https://github.com/apache/lucene-solr/pull/2078
[jira] [Commented] (SOLR-14986) Add warning to ref guide that using "properties.name" is an expert option
[ https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232025#comment-17232025 ] ASF subversion and git services commented on SOLR-14986: Commit 93ecd0fa0a35355d3b10ef0a470c5c515032aa40 in lucene-solr's branch refs/heads/master from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=93ecd0f ] SOLR-14986: Add warning to ref guide that using 'properties.name' is an expert option
> Add warning to ref guide that using "properties.name" is an expert option
> Key: SOLR-14986
> URL: https://issues.apache.org/jira/browse/SOLR-14986
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> This came to light when I was looking at two user-list questions where people try to manually define core.properties to define _replicas_ in SolrCloud. There are two related issues:
> 1> You can do things like "action=CREATE&name=eoe&property.collection=blivet", which results in an opaque error about "could not create replica." I propose we return a better error here, like "property.collection should not be specified when creating a collection". What do people think about the rest of the auto-created properties on collection creation?
> coreNodeName
> collection.configName
> name
> numShards
> shard
> collection
> replicaType
> "name" seems to be OK to change, although I don't see any place anyone can actually see it afterwards
> 2> Change the ref guide to steer people away from attempting to manually create a core.properties file to define cores/replicas in SolrCloud. There's no warning on "defining-core-properties.adoc", for instance. Additionally, there should be some kind of message on the collections API documentation about not trying to set the properties in <1> on the CREATE command.
> <2> used to actually work (apparently) with legacyCloud...
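[Editorial sketch] The validation proposed in <1> above can be illustrated with a small, self-contained sketch. This is not Solr's actual implementation; the class and method names are hypothetical, and the reserved-property list is taken verbatim from the issue description:

```java
import java.util.Set;

// Hypothetical sketch of the proposed CREATE-time check: reject reserved
// replica properties passed as property.* with a clear message, instead
// of the opaque "could not create replica" error.
public class CreatePropertyCheck {
    // Properties Solr auto-creates on collection creation, per the issue.
    static final Set<String> RESERVED = Set.of(
        "coreNodeName", "collection.configName", "numShards",
        "shard", "collection", "replicaType");

    public static void validate(Set<String> userSuppliedProps) {
        for (String p : userSuppliedProps) {
            if (RESERVED.contains(p)) {
                throw new IllegalArgumentException(
                    "property." + p + " should not be specified when creating a collection");
            }
        }
    }

    public static void main(String[] args) {
        validate(Set.of("dataDir"));          // unreserved property: accepted
        try {
            validate(Set.of("collection"));   // reserved property: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only the shape of the error message: it names the offending property rather than reporting a downstream replica-creation failure.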
[jira] [Commented] (SOLR-14986) Add warning to ref guide that using "properties.name" is an expert option
[ https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232026#comment-17232026 ] ASF subversion and git services commented on SOLR-14986: Commit 2fe135a14367c922a97052200096621a5681a6d1 in lucene-solr's branch refs/heads/branch_8x from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2fe135a ] SOLR-14986: Add warning to ref guide that using 'properties.name' is an expert option
[jira] [Resolved] (SOLR-14986) Add warning to ref guide that using "properties.name" is an expert option
[ https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-14986. --- Fix Version/s: 8.8 Resolution: Fixed
[GitHub] [lucene-solr] rmuir commented on pull request #2080: LUCENE-8947: Skip field length accumulation when norms are disabled
rmuir commented on pull request #2080: URL: https://github.com/apache/lucene-solr/pull/2080#issuecomment-727214217 I'm concerned about this change: other things will overflow if you have too many term frequencies in a field. Currently frequency is bounded by 2^31-1 within a doc, and you can only have 2^31-1 documents in the index, so stats like `totalTermFreq` and `sumTotalTermFreq` can't overflow. But with this change it would be easy to do this and break scoring, fail checkindex, etc.
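[Editorial sketch] The overflow argument in this comment can be checked with back-of-the-envelope arithmetic. Assuming both per-document frequency and document count are capped at Integer.MAX_VALUE (2^31-1), the worst-case `sumTotalTermFreq` is about 4.6e18, which fits in a signed 64-bit long (max ~9.2e18); lifting the per-document cap makes the product wrap:

```java
// Sanity-check of the overflow reasoning; the caps are assumptions taken
// from the comment above, not constants read from Lucene's source.
public class OverflowCheck {
    public static void main(String[] args) {
        long maxFreqPerDoc = Integer.MAX_VALUE; // 2^31 - 1, per-doc frequency cap
        long maxDocs = Integer.MAX_VALUE;       // 2^31 - 1, index doc-count cap

        // Worst-case sumTotalTermFreq under the current caps: about 4.6e18,
        // comfortably below Long.MAX_VALUE (about 9.2e18) -- no overflow.
        long worstCase = maxFreqPerDoc * maxDocs;
        System.out.println(worstCase > 0 && worstCase < Long.MAX_VALUE); // true

        // If per-doc frequency could instead reach, say, 2^33, the product
        // exceeds Long.MAX_VALUE and silently wraps negative:
        long unbounded = (1L << 33) * maxDocs;
        System.out.println(unbounded < 0); // true: the long overflowed
    }
}
```

This is the sense in which the caps "protect" the long-typed statistics: the product of the two bounds stays under 2^63.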
[GitHub] [lucene-solr] rmuir merged pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
rmuir merged pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077
[jira] [Resolved] (LUCENE-9605) update snowball to latest (adds Yiddish stemmer)
[ https://issues.apache.org/jira/browse/LUCENE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9605. - Resolution: Fixed > update snowball to latest (adds Yiddish stemmer) > > > Key: LUCENE-9605 > URL: https://issues.apache.org/jira/browse/LUCENE-9605 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Robert Muir >Priority: Major > Fix For: master (9.0) > > Time Spent: 1.5h > Remaining Estimate: 0h > > I'm trying to find time to upstream our snowball diffs... it helps to be > reasonably up to date with their sources. Plus there is a new stemmer added.
[jira] [Commented] (LUCENE-9605) update snowball to latest (adds Yiddish stemmer)
[ https://issues.apache.org/jira/browse/LUCENE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232033#comment-17232033 ] ASF subversion and git services commented on LUCENE-9605: - Commit 52f581e351aa143dab973fc7bc191f90d7caec51 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=52f581e ] LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish (#2077)
[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues
[ https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232040#comment-17232040 ] ASF subversion and git services commented on LUCENE-9378: - Commit 9b4f8235aac5021aaed348055eb7ca0f593804fd in lucene-solr's branch refs/heads/branch_8x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9b4f823 ] LUCENE-9378: Fix test failure.
> Configurable compression for BinaryDocValues
> Key: LUCENE-9378
> URL: https://issues.apache.org/jira/browse/LUCENE-9378
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Viral Gandhi
> Priority: Major
> Fix For: 8.8
> Attachments: hotspots-v76x.png, hotspots-v77x.png, image-2020-06-12-22-17-30-339.png, image-2020-06-12-22-17-53-961.png, image-2020-06-12-22-18-24-527.png, image-2020-06-12-22-18-48-919.png, snapshot-v77x.nps, snapshots-v76x.nps
> Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> Lucene 8.5.1 includes a change to always [compress BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This caused a ~30% reduction in our red-line QPS (throughput).
> We think users should be given some way to opt in to this compression feature instead of it always being enabled, which can have a substantial query-time cost, as we saw during our upgrade. [~mikemccand] suggested one possible approach: introduce a *mode* in Lucene80DocValuesFormat (COMPRESSED and UNCOMPRESSED) and allow users to create a custom Codec, subclassing the default Codec, to pick the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's a related issue for adding a benchmark covering BINARY doc values query-time performance: [https://github.com/mikemccand/luceneutil/issues/61]
[jira] [Commented] (LUCENE-9610) Fix failures of TestKnnGraph.testMerge
[ https://issues.apache.org/jira/browse/LUCENE-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232049#comment-17232049 ] Michael Sokolov commented on LUCENE-9610: - What's happened is that when we impose max-connections fanout limits on the graph it is no longer guaranteed to have the property that every node is reachable. If a single node is far away from all the others, none of them will have it among their k nearest neighbors. I'll relax the test to only assert this property when the number of nodes in the graph is smaller than the maxConn, and then also sometimes test with maxConn very large. > Fix failures of TestKnnGraph.testMerge > -- > > Key: LUCENE-9610 > URL: https://issues.apache.org/jira/browse/LUCENE-9610 > Project: Lucene - Core > Issue Type: Bug >Reporter: Michael Sokolov >Priority: Major > > I saw three failures like those below rep[orted to the mailing list last > night. The seeds do not reproduce for me, but there's clearly something up > FAILED: org.apache.lucene.index.TestKnnGraph.testMerge > Error Message: > java.lang.AssertionError: Attempted to walk entire graph but only visited 255 > expected:<257> but was:<255> > > FAILED: org.apache.lucene.index.TestKnnGraph.testMerge > Error Message: > java.lang.AssertionError: Attempted to walk entire graph but only visited 104 > expected:<105> but was:<104> > Stack Trace: > java.lang.AssertionError: Attempted to walk entire graph but only visited 104 > expected:<105> but was:<104> > at > __randomizedtesting.SeedInfo.seed([F1E895FE123F786D:42494B1A1DE4CEF8]:0) > = > FAILED: org.apache.lucene.index.TestKnnGraph.testMerge > Error Message: > java.lang.AssertionError: Attempted to walk entire graph but only visited 104 > expected:<105> but was:<104> > Stack Trace: > java.lang.AssertionError: Attempted to walk entire graph but only visited 104 > expected:<105> but was:<104> > at > __randomizedtesting.SeedInfo.seed([F1E895FE123F786D:42494B1A1DE4CEF8]:0) -- This 
message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
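Sokolov's diagnosis can be illustrated with a small sketch (plain Java, hypothetical names, not the actual TestKnnGraph code): when the fan-out cap leaves one node with no inbound links, a breadth-first walk from an entry point visits fewer nodes than the graph contains, which is exactly the `visited 104 expected:<105>` shape of the failures above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a graph where one node is never listed as anyone's
// neighbor, so a full walk from the entry point cannot reach it.
public class GraphReachability {
    // Breadth-first walk over an adjacency list; returns the visited count.
    static int walk(int[][] neighbors, int entry) {
        boolean[] seen = new boolean[neighbors.length];
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(entry);
        seen[entry] = true;
        int visited = 0;
        while (!queue.isEmpty()) {
            int node = queue.poll();
            visited++;
            for (int n : neighbors[node]) {
                if (!seen[n]) { seen[n] = true; queue.add(n); }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Node 3 links out to node 0, but no node links to node 3,
        // so a walk from node 0 visits only 3 of the 4 nodes.
        int[][] neighbors = { {1, 2}, {0, 2}, {0, 1}, {0} };
        System.out.println(walk(neighbors, 0) + " of " + neighbors.length);
    }
}
```

The relaxed test described above would assert full reachability only when the node count is below maxConn, since that is the regime where every node can keep links to all others.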
[jira] [Commented] (LUCENE-9610) Fix failures of TestKnnGraph.testMerge
[ https://issues.apache.org/jira/browse/LUCENE-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232055#comment-17232055 ] ASF subversion and git services commented on LUCENE-9610: - Commit 09f78e2927c6f28b9d0ce6a744a53f65999635f6 in lucene-solr's branch refs/heads/master from Michael Sokolov [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=09f78e2 ] LUCENE-9610: fix TestKnnGraph.testMerge
[jira] [Commented] (LUCENE-9610) Fix failures of TestKnnGraph.testMerge
[ https://issues.apache.org/jira/browse/LUCENE-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232056#comment-17232056 ] Michael Sokolov commented on LUCENE-9610: - With hindsight, perhaps this test fix didn't require its own issue - sorry for the noisiness!
[jira] [Resolved] (LUCENE-9610) Fix failures of TestKnnGraph.testMerge
[ https://issues.apache.org/jira/browse/LUCENE-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov resolved LUCENE-9610. - Resolution: Fixed
[jira] [Commented] (SOLR-14035) remove deprecated preferLocalShards references
[ https://issues.apache.org/jira/browse/SOLR-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232070#comment-17232070 ] Alex Bulygin commented on SOLR-14035: - [~cpoerschke], hello! I removed the use of this param; [^SOLR-14035.patch] is attached. Can I get feedback on the fixes? I'm new to working with open source; today I sent mails to subscribe to the distribution group, so for now I only have anonymous access to work with the repo > remove deprecated preferLocalShards references > -- > > Key: SOLR-14035 > URL: https://issues.apache.org/jira/browse/SOLR-14035 > Project: Solr > Issue Type: Task >Reporter: Christine Poerschke >Priority: Blocker > Fix For: master (9.0) > > Attachments: SOLR-14035.patch > > > {{preferLocalShards}} support was added under SOLR-6832 in version 5.1 > (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.1.0/solr/solrj/src/java/org/apache/solr/common/params/CommonParams.java#L223-L226) > and deprecated under SOLR-11982 in version 7.4 > (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/solrj/src/java/org/apache/solr/common/params/CommonParams.java#L265-L269) > This ticket is to fully remove {{preferLocalShards}} references in code, > tests and documentation.
[jira] [Updated] (SOLR-14035) remove deprecated preferLocalShards references
[ https://issues.apache.org/jira/browse/SOLR-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Bulygin updated SOLR-14035: Attachment: SOLR-14035.patch
[GitHub] [lucene-solr] mocobeta opened a new pull request #2081: LUCENE-9413: Add CJKWidthCharFilter and its factory.
mocobeta opened a new pull request #2081: URL: https://github.com/apache/lucene-solr/pull/2081 This adds a char filter (and its factory) which is the exact counterpart of o.a.l.a.cjk.CJKWidthFilter. The char filter would be useful especially for dictionary-based CJK analyzers; e.g. kuromoji. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9413) Add a char filter corresponding to CJKWidthFilter
[ https://issues.apache.org/jira/browse/LUCENE-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232119#comment-17232119 ] Tomoko Uchida commented on LUCENE-9413: --- [https://github.com/apache/lucene-solr/pull/2081] adds CJKWidthCharFilter that is the exact counterpart of CJKWidthFilter. The charfilter would be useful especially for dictionary-based CJK analyzers; e.g. kuromoji. [~rcmuir] what do you think - would you take a look at this? > Add a char filter corresponding to CJKWidthFilter > - > > Key: LUCENE-9413 > URL: https://issues.apache.org/jira/browse/LUCENE-9413 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Tomoko Uchida >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > In association with issues in Elasticsearch > ([https://github.com/elastic/elasticsearch/issues/58384] and > [https://github.com/elastic/elasticsearch/issues/58385]), it might be useful > for Japanese default analyzer. > Although I don't think it's a bug to not normalize FULL and HALF width > characters before tokenization, the behaviour sometimes confuses beginners or > users who have limited knowledge about Japanese analysis (and Unicode). > If we have a FULL and HALF width character normalization filter in > {{analyzers-common}}, we can include it into JapaneseAnalyzer (currently, > JapaneseAnalyzer contains CJKWidthFilter but it is applied after tokenization > so some of FULL width numbers or latin alphabets are separated by the > tokenizer).
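The kind of normalization under discussion can be sketched as follows (a minimal illustration, not the actual CJKWidthFilter/CJKWidthCharFilter code, which also folds half-width katakana): full-width ASCII variants in the U+FF01..U+FF5E block map onto their ASCII counterparts by a fixed offset.

```java
// Hypothetical sketch of full-width -> half-width folding for the ASCII
// variant block only; the escapes below spell out "ＡＢＣ１２３".
public class WidthNormalize {
    static String fold(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c >= 0xFF01 && c <= 0xFF5E) {
                // Full-width ASCII variants sit at a constant offset from ASCII.
                c = (char) (c - 0xFF01 + 0x21);
            } else if (c == 0x3000) {
                c = ' '; // ideographic space -> ordinary space
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13")); // prints "ABC123"
    }
}
```

Doing this as a char filter, before tokenization, is the point of the PR: the tokenizer then never sees mixed-width runs it would otherwise split.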
[jira] [Commented] (LUCENE-9413) Add a char filter corresponding to CJKWidthFilter
[ https://issues.apache.org/jira/browse/LUCENE-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232135#comment-17232135 ] Robert Muir commented on LUCENE-9413: - Yes, I'll help review. I must have missed the PR.
[jira] [Commented] (SOLR-13671) Remove check for bare "var" declarations in validate-source-patterns in before releasing Solr 9.0
[ https://issues.apache.org/jira/browse/SOLR-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232145#comment-17232145 ] Alex Bulygin commented on SOLR-13671: - Please tell me, is [^SOLR-13671.patch] ok? > Remove check for bare "var" declarations in validate-source-patterns in > before releasing Solr 9.0 > - > > Key: SOLR-13671 > URL: https://issues.apache.org/jira/browse/SOLR-13671 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Priority: Blocker > Fix For: master (9.0) > > > See the discussion in the linked JIRA. > Remove the line: > (~$/\n\s*var\s+/$) : 'var is not allowed in until we stop development on the > 8x code line' > in > invalidJavaOnlyPatterns > from lucene/tools/src/groovy/check-source-patterns.groovy
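The Groovy line being removed is a regex-based source check; its behaviour can be sketched in plain Java (the class and method names below are illustrative, not part of check-source-patterns.groovy): any source text containing a bare `var` declaration after a newline is flagged.

```java
import java.util.regex.Pattern;

// Sketch of the check being removed: a pattern equivalent to the Groovy
// regex /\n\s*var\s+/ flags Java sources that use bare "var" declarations.
public class BareVarCheck {
    static final Pattern BARE_VAR = Pattern.compile("\\n\\s*var\\s+");

    static boolean violates(String source) {
        return BARE_VAR.matcher(source).find();
    }

    public static void main(String[] args) {
        System.out.println(violates("class A {\n  void m() {\n    var x = 1;\n  }\n}")); // prints true
        System.out.println(violates("class A {\n  void m() {\n    int x = 1;\n  }\n}")); // prints false
    }
}
```

Once the 8x line no longer constrains development, dropping this pattern from invalidJavaOnlyPatterns lets `var` through the validator on 9.0.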
[jira] [Comment Edited] (SOLR-13671) Remove check for bare "var" declarations in validate-source-patterns in before releasing Solr 9.0
[ https://issues.apache.org/jira/browse/SOLR-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232145#comment-17232145 ] Alex Bulygin edited comment on SOLR-13671 at 11/14/20, 8:15 PM: Please tell me, SOLR-13671.patch is ok? was (Author: alexey bulygin): Please tell me, [^SOLR-13671.patch]is ok?
[jira] [Comment Edited] (SOLR-13671) Remove check for bare "var" declarations in validate-source-patterns in before releasing Solr 9.0
[ https://issues.apache.org/jira/browse/SOLR-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232145#comment-17232145 ] Alex Bulygin edited comment on SOLR-13671 at 11/14/20, 8:16 PM: Please tell me, patch is ok? was (Author: alexey bulygin): Please tell me, SOLR-13671.patch is ok?
[jira] [Updated] (SOLR-13671) Remove check for bare "var" declarations in validate-source-patterns in before releasing Solr 9.0
[ https://issues.apache.org/jira/browse/SOLR-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Bulygin updated SOLR-13671: Attachment: SOLR-13671.patch
[jira] [Updated] (SOLR-13671) Remove check for bare "var" declarations in validate-source-patterns in before releasing Solr 9.0
[ https://issues.apache.org/jira/browse/SOLR-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Bulygin updated SOLR-13671: Attachment: (was: SOLR-13671.patch)
[jira] [Updated] (SOLR-13671) Remove check for bare "var" declarations in validate-source-patterns in before releasing Solr 9.0
[ https://issues.apache.org/jira/browse/SOLR-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Bulygin updated SOLR-13671: Attachment: SOLR-13671.patch
[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
zacharymorn commented on a change in pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r523468450 ## File path: gradle/native/disable-native.gradle ## @@ -17,20 +17,65 @@ // This is the master switch to disable all tasks that compile // native (cpp) code. -def buildNative = propertyOrDefault("build.native", true).toBoolean() +rootProject.ext { + buildNative = propertyOrDefault("build.native", true).toBoolean() +} + +// Explicitly list all projects that should be configured for native extensions. +// We could scan for projects with a the cpp-library plugin but this is faster. +def nativeProjects = allprojects.findAll {it.path in [ +":lucene:misc:native" +]} + +def javaProjectsWithNativeDeps = allprojects.findAll {it.path in [ +":lucene:misc" +]} + +// Set up defaults for projects with native dependencies. +configure(javaProjectsWithNativeDeps, { + configurations { Review comment: Wow, thanks Dawid so much for the detailed and thorough explanations and pointers! I did come across some of this documentation individually before; it's just that the combination of configuration + variant + sync + cpp confused me quite a bit. But it's pretty clear now. Thanks again for your patience and guidance here! I do have a final question though. It seems at this point we should also be able to easily support packaging the compiled native code into the misc jar package (and change to use the optimized variant of the build), to remove the final manual step of copying around the native library artifact for production usage of `WindowsDirectory` or `NativeUnixDirectory`? Do you think this is a good direction to take?
[jira] [Commented] (LUCENE-9607) TestUniformSplitPostingFormat.testCheckIntegrityReadsAllBytes test failure
[ https://issues.apache.org/jira/browse/LUCENE-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232168#comment-17232168 ] Michael McCandless commented on LUCENE-9607: {quote}This always seems to be associated with a build of {{Lucene-Solr-cloud2refimpl-Linux}}. Maybe there is a fix that never made it to that branch? It does point to the risk of drift when we maintain these long-running branches. I don't know if there is any effort being made to track trunk there? {quote} Thanks [~sokolov] – I think you are right: this failure only seems to happen on the {{cloud2refimpl}} branch, which might be missing some recent fixes from mainline. {quote}I have my mail filters set to put all the reference impl failures in a separate mailbox, that's the only place I recall seeing this, although that's just going from my memory. {quote} OK I have just now also added a personal gmail filter to suppress CI build failure emails for this branch. We need to find a better solution than each dev having to set up their personal mail filters. Maybe CI build emails for a dev branch should be opt-in? Or, if we keep them as opt-out, the subject line of the email should make it clear that this is an experimental branch? 
> TestUniformSplitPostingFormat.testCheckIntegrityReadsAllBytes test failure > -- > > Key: LUCENE-9607 > URL: https://issues.apache.org/jira/browse/LUCENE-9607 > Project: Lucene - Core > Issue Type: Bug >Reporter: Michael McCandless >Priority: Major > > CI builds have been failing with this: > {noformat} > FAILED: > org.apache.lucene.codecs.uniformsplit.TestUniformSplitPostingFormat.testCheckIntegrityReadsAllBytes > Error Message: > java.lang.AssertionError > Stack Trace: > java.lang.AssertionError > at > __randomizedtesting.SeedInfo.seed([43D1E1D1DB325AD7:3E13F00D7ACC8E7E]:0) > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.lucene.codecs.uniformsplit.TestUniformSplitPostingFormat.checkEncodingCalled(TestUniformSplitPostingFormat.java:63) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:564) > at > com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1000) > at > org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) > at > org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) > at > org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) > at > org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) > at > org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) > at > 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > at > com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370) > at > com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:819) > at > com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:470) > at > com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:951) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:836) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:887) > at > com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:898) > at > org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) > at > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > at > org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) > at > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) > at > com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluat
[GitHub] [lucene-solr] dsmiley commented on pull request #1972: SOLR-14915: Prometheus-exporter does not depend on Solr-core any longer
dsmiley commented on pull request #1972: URL: https://github.com/apache/lucene-solr/pull/1972#issuecomment-727526911 Based on your advice, I put back the "java-library" plugin in place of the application one, but added a "run" task. The prometheus-exporter is fundamentally dissimilar to the other contribs. The others have Solr plugins -- they run *inside* Solr, and thus don't need "lib" directories containing any of the libs that are in Solr. But I don't think the prometheus-exporter should have such a limited list. I see `gradle/solr/packaging.gradle` and it treats all contribs the same. So I added more assemblePackage code here that ends up adding other libs like jackson, because it does not exclude all so-called "Solr platform libs". Also, I ensured all server/lib/ext JARs (which are all logging) don't get copied to any contrib's libs.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss commented on a change in pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r523722104 ## File path: gradle/native/disable-native.gradle ## @@ -17,20 +17,65 @@ // This is the master switch to disable all tasks that compile // native (cpp) code. -def buildNative = propertyOrDefault("build.native", true).toBoolean() +rootProject.ext { + buildNative = propertyOrDefault("build.native", true).toBoolean() +} + +// Explicitly list all projects that should be configured for native extensions. +// We could scan for projects with a the cpp-library plugin but this is faster. +def nativeProjects = allprojects.findAll {it.path in [ +":lucene:misc:native" +]} + +def javaProjectsWithNativeDeps = allprojects.findAll {it.path in [ +":lucene:misc" +]} + +// Set up defaults for projects with native dependencies. +configure(javaProjectsWithNativeDeps, { + configurations { Review comment: This is something I considered but eventually left out. Technically, it's trivial to do but my concern is that then the content of the "lucene distribution" depends on the machine you build it on - native libraries will vary and you can't (?) easily compile all libraries for the most common platforms to include all of them in the distribution... I don't know how to solve it. Unless those native extensions offer super performance boosts I'd rather treat them as "sandbox", expert-level stuff and not include them in the distribution by default? You can bring this issue up on the dev list, perhaps other people have an opinion here - I have never used those extensions myself.
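The concern above — that a jar built on one machine would carry only that machine's native variant — follows from how the JVM derives platform-specific library file names. A minimal illustration (hypothetical class name, not Lucene's actual WindowsDirectory/NativeUnixDirectory loading code):

```java
// Illustrative only: System.mapLibraryName turns a base library name into
// the platform-specific file name, e.g. "libLuceneNativeIO.so" on Linux,
// "LuceneNativeIO.dll" on Windows, "libLuceneNativeIO.dylib" on macOS.
// A build therefore produces only the variant for the machine it runs on,
// which is why bundling one artifact per distribution is problematic.
public class NativeLibName {
    static String platformFileName(String base) {
        return System.mapLibraryName(base);
    }

    public static void main(String[] args) {
        System.out.println(platformFileName("LuceneNativeIO"));
    }
}
```

Shipping all variants would require cross-compiling for each target platform and bundling every resulting file, with a loader that picks the right one at runtime — which is the part the thread leaves unresolved.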