[jira] [Commented] (LUCENE-9670) gradle precommit sometimes fails with "IOException: stream closed" from javadoc in nightly benchmarks
[ https://issues.apache.org/jira/browse/LUCENE-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267066#comment-17267066 ] Dawid Weiss commented on LUCENE-9670: - I would start by running gradlew in non-daemon mode, Mike. When you're using a daemon it connects to an existing process... who knows how this is handled when you're piping from Python. Add --no-daemon to all commands where you invoke gradlew. Maybe it'll help. > gradle precommit sometimes fails with "IOException: stream closed" from > javadoc in nightly benchmarks > - > > Key: LUCENE-9670 > URL: https://issues.apache.org/jira/browse/LUCENE-9670 > Project: Lucene - Core > Issue Type: Bug >Reporter: Michael McCandless >Priority: Major > > I recently added tracking how long {{gradle precommit}} takes each night so > we can track slowdowns over time. > But it sometimes fails with: > {noformat} > > Task :lucene:join:renderJavadoc FAILED > Could not read standard output of command '/opt/jdk-15.0.1/bin/javadoc'. 
> java.io.IOException: Stream Closed > at java.base/java.io.FileOutputStream.writeBytes(Native Method) > at java.base/java.io.FileOutputStream.write(FileOutputStream.java:347) > at > java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) > at > java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142) > at > org.gradle.process.internal.streams.ExecOutputHandleRunner.forwardContent(ExecOutputHandleRunner.java:68) > at > org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53) > at > org.gradle.internal.operations.CurrentBuildOperationPreservingRunnable.run(CurrentBuildOperationPreservingRunnable.java:42) > at > org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64) > at > org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > at > org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56) > at java.base/java.lang.Thread.run(Thread.java:832) {noformat} > I'm not sure why ... when I run {{./gradlew precommit}} interactively it > doesn't seem to do this. > The nightly tool is quite simple – it just launches a sub-process using > {{os.system}}: (first to {{git clean}} then to run {{./gradlew precommit)}}: > https://github.com/mikemccand/luceneutil/blob/master/src/python/runNightlyGradleTestPrecommit.py -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on a change in pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
s1monw commented on a change in pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#discussion_r559385012 ## File path: lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java ## @@ -104,6 +104,23 @@ public static DirectoryReader open(final IndexCommit commit) throws IOException return StandardDirectoryReader.open(commit.getDirectory(), commit); } + /** + * Expert: returns an IndexReader reading the index in the given {@link IndexCommit}. This method + * allows opening indices that were created with a Lucene version older than N-1, provided that + * all codecs for this index are available in the classpath and the segment file format used was + * created with Lucene 7 or older. Users of this API must be aware that Lucene doesn't guarantee Review comment: this is due to the fact that the segments info format only supports 7.0 and upwards This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9674) Faster advance on Vector Values
Anand Kotriwal created LUCENE-9674: -- Summary: Faster advance on Vector Values Key: LUCENE-9674 URL: https://issues.apache.org/jira/browse/LUCENE-9674 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: master (9.0) Reporter: Anand Kotriwal The advance() function in the class Lucene90VectorReader does a linear search for the target document. To make it faster we can do a binary search over the "ordToDoc" array, which will make the advance operation take logarithmic time. This will make retrieving vectors for a sparse set of documents efficient.
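The improvement described in this issue can be sketched in plain Java. This is an illustrative sketch of the proposed technique, not the actual Lucene90VectorReader code: `ordToDoc` here stands in for the reader's sorted ordinal-to-docID array, and the contract assumed (return the first docID at or after the target, or -1 when the iterator is exhausted) is for illustration only.

```java
// Illustrative sketch: binary-search advance() over a sorted ord-to-docID
// mapping, replacing a linear scan. Names are assumptions, not Lucene's code.
public class OrdToDocAdvance {
  private final int[] ordToDoc; // sorted docIDs, indexed by vector ordinal
  private int ord = -1;         // current ordinal position

  public OrdToDocAdvance(int[] ordToDoc) {
    this.ordToDoc = ordToDoc;
  }

  /** Advance to the first document >= target in O(log n) instead of O(n). */
  public int advance(int target) {
    int lo = ord + 1, hi = ordToDoc.length - 1;
    if (lo > hi) return -1; // already exhausted
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1; // unsigned shift avoids overflow
      if (ordToDoc[mid] < target) lo = mid + 1;
      else hi = mid - 1;
    }
    if (lo >= ordToDoc.length) return -1; // no document >= target remains
    ord = lo;
    return ordToDoc[lo];
  }

  public static void main(String[] args) {
    OrdToDocAdvance it = new OrdToDocAdvance(new int[] {2, 5, 9, 40, 41});
    System.out.println(it.advance(6));  // 9
    System.out.println(it.advance(10)); // 40
    System.out.println(it.advance(50)); // -1
  }
}
```

Since the array is sorted, `java.util.Arrays.binarySearch` could equally be used, taking care to translate its negative insertion-point return values.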
[jira] [Created] (LUCENE-9675) Expose the compression mode of the binary doc values
Jim Ferenczi created LUCENE-9675: Summary: Expose the compression mode of the binary doc values Key: LUCENE-9675 URL: https://issues.apache.org/jira/browse/LUCENE-9675 Project: Lucene - Core Issue Type: Improvement Reporter: Jim Ferenczi LUCENE-9378 introduced a way to configure the compression mode of the binary doc values. This issue is a proposal to expose this information in the attributes of each binary field. That would expose this information to external readers on a per-field basis.
[jira] [Updated] (LUCENE-9675) Expose the compression mode of the binary doc values
[ https://issues.apache.org/jira/browse/LUCENE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-9675: - Status: Open (was: Open) > Expose the compression mode of the binary doc values > -
[jira] [Updated] (LUCENE-9675) Expose the compression mode of the binary doc values
[ https://issues.apache.org/jira/browse/LUCENE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-9675: - Attachment: LUCENE-9675.patch Status: Open (was: Open) Here's a patch that adds the compression mode in the attributes of the FieldInfo. > Expose the compression mode of the binary doc values > -
[jira] [Commented] (LUCENE-9675) Expose the compression mode of the binary doc values
[ https://issues.apache.org/jira/browse/LUCENE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267140#comment-17267140 ] Ignacio Vera commented on LUCENE-9675: -- +1 Reading through the patch, maybe the key should have extension {{.mode}} instead of {{.compression_mode}} to be consistent with the stored fields implementation? > Expose the compression mode of the binary doc values > -
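The attribute round-trip this patch describes can be sketched with a plain Map standing in for Lucene's FieldInfo attributes. The key name and default value below are assumptions for illustration (echoing the `.mode` vs `.compression_mode` naming discussed above), not the actual patch:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of exposing a per-field codec choice through field attributes,
// using a plain Map in place of Lucene's FieldInfo attributes. The key and
// default value below are assumptions for illustration, not the patch itself.
public class DocValuesAttributes {
  // Hypothetical attribute key; the review above suggests a ".mode" suffix
  // for consistency with the stored-fields implementation.
  static final String MODE_KEY = "Lucene80DocValuesFormat.mode";

  /** Writer side: record the configured compression mode for this field. */
  static void setMode(Map<String, String> fieldAttributes, String mode) {
    fieldAttributes.put(MODE_KEY, mode);
  }

  /** Reader side: an external reader recovers the mode on a per-field basis. */
  static String getMode(Map<String, String> fieldAttributes) {
    return fieldAttributes.getOrDefault(MODE_KEY, "BEST_SPEED");
  }

  /** Convenience round-trip used for demonstration. */
  static String roundTrip(String mode) {
    Map<String, String> attrs = new HashMap<>();
    setMode(attrs, mode);
    return getMode(attrs);
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("BEST_COMPRESSION")); // BEST_COMPRESSION
  }
}
```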
[jira] [Assigned] (SOLR-15052) Reducing overseer bottlenecks using per-replica states
[ https://issues.apache.org/jira/browse/SOLR-15052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-15052: - Assignee: Noble Paul > Reducing overseer bottlenecks using per-replica states > -- > > Key: SOLR-15052 > URL: https://issues.apache.org/jira/browse/SOLR-15052 > Project: Solr > Issue Type: Improvement > Security Level: Public (Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Attachments: per-replica-states-gcp.pdf > > Time Spent: 10h 20m > Remaining Estimate: 0h > > This work has the same goal as SOLR-13951, that is to reduce overseer > bottlenecks by avoiding replica state updates from going to the state.json > via the overseer. However, the approach taken here is different from > SOLR-13951 and hence this work supersedes that work. > The design proposed is here: > https://docs.google.com/document/d/1xdxpzUNmTZbk0vTMZqfen9R3ArdHokLITdiISBxCFUg/edit > Briefly, > # Every replica's state will be in a separate znode nested under the > state.json. It has a name that encodes the replica name, state, and leadership > status. > # An additional children watcher to be set on state.json for state changes. > # Upon a state change, a ZK multi-op to delete the previous znode and add a > new znode with the new state. > Differences between this and SOLR-13951, > # In SOLR-13951, we planned to leverage shard terms for per-shard states. > # As a consequence, the code changes required for SOLR-13951 were massive (we > needed a shard state provider abstraction and introduce it everywhere in the > codebase). > # This approach is a drastically simpler change and design. > Credits for this design and the PR are due to [~noble.paul]. > [~markrmil...@gmail.com], [~noble.paul] and I have collaborated on this > effort. The reference branch takes a conceptually similar (but not identical) > approach. > I shall attach a PR and performance benchmarks shortly. 
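The central trick in the design quoted above is packing a replica's entire state into the znode *name*, so a single children watch on state.json sees every change. A minimal sketch of such an encoding, in plain Java — the "name:state:L" layout here is an assumption for illustration, not the exact format Solr adopted:

```java
// Sketch of encoding replica name, state, and leadership status into a
// single znode name, per the per-replica-states design. The layout
// "replica:state[:L]" is a hypothetical format, not Solr's actual one.
public class PerReplicaState {
  final String replica;
  final String state;    // e.g. ACTIVE, DOWN, RECOVERING
  final boolean isLeader;

  PerReplicaState(String replica, String state, boolean isLeader) {
    this.replica = replica;
    this.state = state;
    this.isLeader = isLeader;
  }

  /** All state lives in the name, so a children watch alone detects changes. */
  String asZnodeName() {
    return replica + ":" + state + (isLeader ? ":L" : "");
  }

  /** Inverse operation: recover the state from a child znode's name. */
  static PerReplicaState parse(String znodeName) {
    String[] parts = znodeName.split(":");
    return new PerReplicaState(parts[0], parts[1],
        parts.length > 2 && parts[2].equals("L"));
  }

  public static void main(String[] args) {
    String name = new PerReplicaState("core_node4", "ACTIVE", true).asZnodeName();
    System.out.println(name); // core_node4:ACTIVE:L
  }
}
```

On a state change, the old and new names would be swapped atomically with a ZooKeeper multi-op (delete old child, create new child), as the design describes.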
[jira] [Resolved] (SOLR-15052) Reducing overseer bottlenecks using per-replica states
[ https://issues.apache.org/jira/browse/SOLR-15052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-15052. --- Fix Version/s: 8.8 Resolution: Fixed > Reducing overseer bottlenecks using per-replica states > --
[GitHub] [lucene-solr] codaitya opened a new pull request #2214: LUCENE-9674:Faster advance on Vector Values
codaitya opened a new pull request #2214: URL: https://github.com/apache/lucene-solr/pull/2214 # Description Currently the advance() function in the class Lucene90VectorReader does a linear search for the target document. This can be an expensive operation if we are searching for sparse documents having vector fields. # Solution Implement a binary search over the "ordToDoc" array, which will make the advance operation take logarithmic time and make retrieving vectors for a sparse set of documents efficient. # Tests Added testAdvance() in class TestVectorValues. It creates an index with gaps for vector fields and randomly calls advance. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `./gradlew check`. - [x] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[jira] [Commented] (SOLR-12613) Rename "Cloud" tab as "Cluster" in Admin UI
[ https://issues.apache.org/jira/browse/SOLR-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267192#comment-17267192 ] David Eric Pugh commented on SOLR-12613: Definitely a fair point [~janhoy]! > Rename "Cloud" tab as "Cluster" in Admin UI > --- > > Key: SOLR-12613 > URL: https://issues.apache.org/jira/browse/SOLR-12613 > Project: Solr > Issue Type: Improvement > Components: Admin UI >Reporter: Jan Høydahl >Priority: Major > Labels: newdev > Fix For: 8.1, master (9.0) > > > Spinoff from SOLR-8207. When adding more cluster-wide functionality to the > Admin UI, it feels better to name the "Cloud" UI tab as "Cluster". > In addition to renaming the "Cloud" tab, we should also change the URL part > from {{~cloud}} to {{~cluster}}, update reference guide page names, > screenshots and references etc. > I propose this change is not introduced in 7.x due to the impact, so tagged > it as fix-version 8.0.
[GitHub] [lucene-solr] mhitza commented on pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
mhitza commented on pull request #1435: URL: https://github.com/apache/lucene-solr/pull/1435#issuecomment-762201796 @epugh The main difference I see between this service file and the docker configuration is that the docker container starts the service in foreground mode. This is expected, as docker containers run single services, and also because running systemd within docker is pretty hairy and platform-specific (it can be done only if the host is another system that has systemd available, or at least cgroups that can be mounted read-only within the container). When proposing this change we briefly discussed on the mailing list the option to start Solr in foreground mode (see [mailing list thread](https://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3ccafszzzxs+zh1mrscsjftyxn0kod_+6fjobxd9zhxt66fhaz...@mail.gmail.com%3e)). Regarding testing, the only option I can think of is via configuration management (e.g. ansible) targeting a VM (because docker is not as straightforward, as mentioned before, but doable). @janhoy, are you referring to the service file within this PR, or something else? As far as I know, all the common distros are running systemd, unless those peers are on old distros that are no longer maintained. And on systemd systems, there shouldn't be any extra package required to run this (except for the obvious JRE requirement for Solr itself).
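For readers unfamiliar with the discussion, a systemd unit supervising Solr in foreground mode might look roughly like the following. This is a hypothetical sketch, not the file from the PR: the install path, user name, and restart policy are assumptions.

```ini
# Hypothetical sketch of a systemd unit for Solr; the actual file in the PR
# may differ. Paths and the "solr" user are assumptions for illustration.
[Unit]
Description=Apache Solr
After=network.target

[Service]
Type=simple
User=solr
# "-f" keeps Solr in the foreground so systemd can supervise the process
# directly, mirroring how the Docker image runs it.
ExecStart=/opt/solr/bin/solr start -f
Restart=on-failure

[Install]
WantedBy=multi-user.target
```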
[GitHub] [lucene-solr] epugh commented on pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
epugh commented on pull request #1435: URL: https://github.com/apache/lucene-solr/pull/1435#issuecomment-762208470 I'd be open to testing this out. Also, is this more of a 9.0 thing versus a branch 8? Seems like changing how you install Solr is a pretty big deal. I've done the upgrade a few times using the old scripts, but this seems like a breaking change since you would lose the old way, right?
[GitHub] [lucene-solr] epugh opened a new pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
epugh opened a new pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215 # Description This PR supersedes the work done in #2016, as it doesn't drag in all the commits made to master. I followed the steps recommended by Joel Bernstein in another PR to clean up the commit history when creating this PR. To improve our security posture, this moves the ScriptingUpdateProcessor to a new contrib module that isn't installed in Solr by default. This is also a chance to clean up the name of the processor from the old, slightly awkward name "StatelessScriptingUpdateProcessor" to a simpler name. # Solution * Created a new `/contrib/scripting` module, and moved the code and tests related to scripting under it. * Updated all the references to `StatelessScriptingUpdateProcessor` to `ScriptingUpdateProcessor` in code and ref guide. # Tests Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `./gradlew check`. - [x] I have added tests for my changes. - [x] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[GitHub] [lucene-solr] epugh commented on pull request #2016: SOLR-14067 v2 Move Stateless Scripting Update Process to /contrib/scripting module
epugh commented on pull request #2016: URL: https://github.com/apache/lucene-solr/pull/2016#issuecomment-762219832 I followed the steps that Joel recommended in another thread and created a new clean branch, #2215. I will close this one in favour of that PR, which is much more legible.
[GitHub] [lucene-solr] epugh closed pull request #2016: SOLR-14067 v2 Move Stateless Scripting Update Process to /contrib/scripting module
epugh closed pull request #2016: URL: https://github.com/apache/lucene-solr/pull/2016
[jira] [Commented] (SOLR-13756) ivy cannot download org.restlet.ext.servlet jar
[ https://issues.apache.org/jira/browse/SOLR-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267235#comment-17267235 ] David Eric Pugh commented on SOLR-13756: I am going to close this, based on one more search around the source. Please comment/reopen if I'm wrong on this. > ivy cannot download org.restlet.ext.servlet jar > --- > > Key: SOLR-13756 > URL: https://issues.apache.org/jira/browse/SOLR-13756 > Project: Solr > Issue Type: Bug >Reporter: Chongchen Chen >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > I checked out the project and ran `ant idea`, which tries to download jars. But > https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar > now returns a 404. > [ivy:retrieve] public: tried > [ivy:retrieve] > https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar > [ivy:retrieve]:: > [ivy:retrieve]:: FAILED DOWNLOADS:: > [ivy:retrieve]:: ^ see resolution messages for details ^ :: > [ivy:retrieve]:: > [ivy:retrieve]:: > org.restlet.jee#org.restlet;2.3.0!org.restlet.jar > [ivy:retrieve]:: > org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar > [ivy:retrieve]::
[jira] [Resolved] (SOLR-13756) ivy cannot download org.restlet.ext.servlet jar
[ https://issues.apache.org/jira/browse/SOLR-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Eric Pugh resolved SOLR-13756. Resolution: Not A Problem I believe this is "Not a Problem" since the underlying issue has been taken care of. > ivy cannot download org.restlet.ext.servlet jar > ---
[jira] [Closed] (SOLR-13756) ivy cannot download org.restlet.ext.servlet jar
[ https://issues.apache.org/jira/browse/SOLR-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Eric Pugh closed SOLR-13756. -- > ivy cannot download org.restlet.ext.servlet jar > ---
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267239#comment-17267239 ] David Eric Pugh commented on SOLR-13105: I tried your commands on another branch of mine that was in a similar situation, and it worked great. > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > Time Spent: 20m > Remaining Estimate: 0h > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. > * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. 
[jira] [Created] (SOLR-15085) EmbeddedSolrServer calls shutdown on a provided CoreContainer
Tim Owen created SOLR-15085: --- Summary: EmbeddedSolrServer calls shutdown on a provided CoreContainer Key: SOLR-15085 URL: https://issues.apache.org/jira/browse/SOLR-15085 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Server, SolrJ Affects Versions: master (9.0) Reporter: Tim Owen There are essentially two ways to create an EmbeddedSolrServer object: one by passing in a CoreContainer object, while the other creates one internally on the fly. The current behaviour of the close method calls shutdown on the CoreContainer, regardless of where it came from. I believe this is not good behaviour for a class that doesn't control the lifecycle of the passed-in CoreContainer. In fact, there are 4 cases in the codebase where a subclass of EmbeddedSolrServer is created just to override this behaviour (with a comment saying it's unwanted). In my use-case I create EmbeddedSolrServer instances for cores as and when I need to work with them, but the CoreContainer exists for the duration. I don't want the whole container shut down when I'm done with just one of its cores. You can work around it by just not calling close on the EmbeddedSolrServer object, but that's risky, especially if you use try-with-resources, as close is then called automatically. The fix is to keep track of whether the CoreContainer was created internally or not, and only shut it down if internal. I will attach my patch PR.
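The ownership-tracking fix described above can be sketched with plain-Java stand-ins. This is an illustrative sketch, not Solr's actual EmbeddedSolrServer or CoreContainer; the class names and the boolean flag are assumptions showing the pattern:

```java
// Sketch of the fix: only shut down the container when this server created
// it. Plain-Java stand-ins for Solr's classes; names are assumptions.
public class EmbeddedServerSketch implements AutoCloseable {
  /** Minimal stand-in for Solr's CoreContainer. */
  static class Container {
    boolean shutdownCalled = false;
    void shutdown() { shutdownCalled = true; }
  }

  private final Container container;
  private final boolean containerIsLocal; // true only if we created it

  /** Caller owns the container, so close() must not shut it down. */
  EmbeddedServerSketch(Container provided) {
    this.container = provided;
    this.containerIsLocal = false;
  }

  /** Container created internally on the fly; we own its lifecycle. */
  EmbeddedServerSketch() {
    this.container = new Container();
    this.containerIsLocal = true;
  }

  Container container() { return container; }

  @Override
  public void close() {
    if (containerIsLocal) {
      container.shutdown(); // safe: no one else is using it
    }
  }

  public static void main(String[] args) {
    Container shared = new Container();
    // try-with-resources calls close() automatically, which is exactly the
    // risky scenario the issue describes for a provided container.
    try (EmbeddedServerSketch s = new EmbeddedServerSketch(shared)) { }
    System.out.println(shared.shutdownCalled); // false: caller still owns it
  }
}
```

The same flag makes the four workaround subclasses in the codebase unnecessary, since the default close() now does the right thing for both construction paths.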
[GitHub] [lucene-solr] timatbw opened a new pull request #2216: SOLR-15085 Prevent EmbeddedSolrServer calling shutdown on a CoreConta…
timatbw opened a new pull request #2216: URL: https://github.com/apache/lucene-solr/pull/2216 …iner that was passed to it # Description Prevent EmbeddedSolrServer calling shutdown on a CoreContainer that was passed to it. # Solution Now keeping track of whether the CoreContainer was provided or created internally and only calling shutdown for internally-created instances. # Tests Modified appropriate test to confirm behaviour, and removed overrides used in existing tests to workaround this issue. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `./gradlew check`. - [x] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-15085) EmbeddedSolrServer calls shutdown on a provided CoreContainer
[ https://issues.apache.org/jira/browse/SOLR-15085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Owen updated SOLR-15085: Labels: pull-request-available (was: ) > EmbeddedSolrServer calls shutdown on a provided CoreContainer > - > > Key: SOLR-15085 > URL: https://issues.apache.org/jira/browse/SOLR-15085 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Server, SolrJ >Affects Versions: master (9.0) >Reporter: Tim Owen >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > There are essentially 2 ways to create an EmbeddedSolrServer object, one by > passing in a CoreContainer object, and the other way creates one internally > on-the-fly. The current behaviour of the close method calls shutdown on the > CoreContainer, regardless of where it came from. > I believe this is not good behaviour for a class that doesn't control the > lifecycle of the passed-in CoreContainer. In fact, there are 4 cases among > the codebase where a subclass of EmbeddedSolrServer is created just to > override this behaviour (with a comment saying it's unwanted). > In my use-case I create EmbeddedSolrServer instances for cores as and when I > need to work with them, but the CoreContainer exists for the duration. I > don't want the whole container shut down when I'm done with just one of its > cores. You can workaround it by just not calling close on the > EmbeddedSolrServer object, but that's risky especially if you use a > try-with-resources as close is called automatically then. > Fix is to keep track of whether the CoreContainer was created internally or > not, and only shut it down if internal. I will attach my patch PR. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mhitza commented on pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
mhitza commented on pull request #1435: URL: https://github.com/apache/lucene-solr/pull/1435#issuecomment-762235551 When I wrote this systemd service file I was running Solr 8, that is correct. I think it should work with Solr 9 as is. From memory, when working on updating the installer, you should be able to run the updated installer without losing the old setup, because on systemd systems all SysV scripts are taken over/"wrapped around" by systemd. So even if the previous docs stated commands like `service solr start`, it should work with `systemctl start solr` as is. You could also try out a new installation of Solr using the installer's `-s solr2` flag, for example, and then start the new service type with `systemctl start solr2` (of course you would need to stop the previous Solr instance first, as they run on the same port). What I haven't tested is what happens when you run the installer without any flags on a system that already has Solr installed. It should generate a solr.service file (thus having the same service name), but *I think* it would supplant the SysV init script. That is, if a systemd unit and a SysV script exist with the same name, the native solr.service takes precedence over the SysV one.
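For readers following this thread, a typical systemd unit for Solr looks roughly like the sketch below. This is hypothetical: the paths, user, and `Type` are assumptions, not the exact file from this PR.

```ini
# Hypothetical /etc/systemd/system/solr.service sketch (not the PR's file).
[Unit]
Description=Apache Solr
After=network.target

[Service]
# bin/solr forks into the background, hence Type=forking.
Type=forking
User=solr
ExecStart=/opt/solr/bin/solr start
ExecStop=/opt/solr/bin/solr stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After installing such a file you would run `systemctl daemon-reload` and then `systemctl start solr`; on systemd hosts this native unit takes precedence over a SysV script of the same name, as discussed above.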
[jira] [Updated] (LUCENE-9675) Expose the compression mode of the binary doc values
[ https://issues.apache.org/jira/browse/LUCENE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi updated LUCENE-9675: - Attachment: LUCENE-9675.patch Status: Open (was: Open) Thanks [~ivera], I attached a new patch that uses the `.mode` suffix. > Expose the compression mode of the binary doc values > > > Key: LUCENE-9675 > URL: https://issues.apache.org/jira/browse/LUCENE-9675 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jim Ferenczi >Priority: Minor > Attachments: LUCENE-9675.patch, LUCENE-9675.patch > > > LUCENE-9378 introduced a way to configure the compression mode of the binary > doc values. > This issue is a proposal to expose this information in the attributes of each > binary field. > That would expose this information to external readers on a per-field basis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14067) Move StatelessScriptUpdateProcessor to a contrib
[ https://issues.apache.org/jira/browse/SOLR-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267294#comment-17267294 ] David Smiley commented on SOLR-14067: - I like the rename (I suggested it after all) -- mostly glad to see the "Stateless" part gone. Now that I look at it, I think "ScriptUpdateProcessor" is better than "ScriptingUpdateProcessor" because I think the noun form makes more sense than the verb. WDYT? > Move StatelessScriptUpdateProcessor to a contrib > > > Key: SOLR-14067 > URL: https://issues.apache.org/jira/browse/SOLR-14067 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: David Eric Pugh >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > Move server-side scripting out of core and into a new contrib. This is > better for security. > Former description: > > We should eliminate all scripting capabilities within Solr. Let us start with > the StatelessScriptUpdateProcessor deprecation/removal.
[jira] [Commented] (SOLR-14067) Move StatelessScriptUpdateProcessor to a contrib
[ https://issues.apache.org/jira/browse/SOLR-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267295#comment-17267295 ] David Smiley commented on SOLR-14067: - Ah; actually you chose "ScriptUpdateProcessor" after all, I see. I thought otherwise because the CHANGES.txt in your latest PR is incorrect. I'll review further there but wanted to discuss the name in the issue to ensure wide peer review. > Move StatelessScriptUpdateProcessor to a contrib > > > Key: SOLR-14067 > URL: https://issues.apache.org/jira/browse/SOLR-14067 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: David Eric Pugh >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > Move server-side scripting out of core and into a new contrib. This is > better for security. > Former description: > > We should eliminate all scripting capabilities within Solr. Let us start with > the StatelessScriptUpdateProcessor deprecation/removal. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] janhoy commented on pull request #1435: SOLR-14410: Switch from SysV init script to systemd service file
janhoy commented on pull request #1435: URL: https://github.com/apache/lucene-solr/pull/1435#issuecomment-762275006 > are you referring to the service file within this PR, or something else I think you got me wrong - I said that systemd (the new script) should work ootb but the old initd style may require an extra package in modern Unix systems to even work (service command). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
mikemccand commented on a change in pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#discussion_r559531239 ## File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java ## @@ -3900,6 +3907,12 @@ public static Options parseOptions(String[] args) { } i++; opts.dirImpl = args[i]; + } else if ("-min_version_created".equals(args[i])) { Review comment: `-min_major_version_created`? ## File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java ## @@ -462,6 +463,11 @@ public void setInfoStream(PrintStream out, boolean verbose) { this.verbose = verbose; } + /** Set the minimum index version created for the index to check */ + public void setMinIndexVersionCreated(int minIndexVersionCreated) { Review comment: Could we consistently rename to `setMinIndexMajorVersionCreated`, and `minIndexMajorVersionCreated`? (I see e.g. in `SIS.readCommit` below that we include `major` in the name). ## File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java ## @@ -3900,6 +3907,12 @@ public static Options parseOptions(String[] args) { } i++; opts.dirImpl = args[i]; + } else if ("-min_version_created".equals(args[i])) { +if (i == args.length - 1) { + throw new IllegalArgumentException("ERROR: missing value for -min_version_created"); Review comment: Hmm, we should also update the `Usage: ...` exception (around line 3928 in this modified version) to document this new option? If a user tries to `CheckIndex` a too-old index without this option they'll see a `IndexFormatTooOldException` right? Should we catch that and rethrow w/ better message suggesting to use this option? Should we maybe by default just set this option (always allow `CheckIndex` on a too-old index as long as you have the old Codecs around...)? 
## File path: lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java ## @@ -104,6 +104,23 @@ public static DirectoryReader open(final IndexCommit commit) throws IOException return StandardDirectoryReader.open(commit.getDirectory(), commit); } + /** + * Expert: returns an IndexReader reading the index in the given {@link IndexCommit}. This method Review comment: s/`in`/`on` ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -1009,6 +1009,14 @@ public IndexWriter(Directory d, IndexWriterConfig conf) throws IOException { changed(); } else if (reader != null) { +if (reader.segmentInfos.getIndexCreatedVersionMajor() < Version.MIN_SUPPORTED_MAJOR) { Review comment: Hmm does `addIndexes` try to verify version of the incoming index is not too old? We will keep doing that, right? I.e. the only added best effort here is when directly opening an `IndexReader` you can (with this change) now ask that older versions be allowed. ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -499,7 +515,7 @@ private static Codec readCodec(DataInput input) throws IOException { throw new IllegalArgumentException( "Could not load codec '" + name -+ "'. Did you forget to add lucene-backward-codecs.jar?", ++ "'. Did you forget to add lucene-backward-codecs.jar?", Review comment: Heh ## File path: lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java ## @@ -104,6 +104,23 @@ public static DirectoryReader open(final IndexCommit commit) throws IOException return StandardDirectoryReader.open(commit.getDirectory(), commit); } + /** + * Expert: returns an IndexReader reading the index in the given {@link IndexCommit}. This method + * allows to open indices that were created wih a Lucene version older than N-1 provided that all + * codecs for this index are available in the classpath and the segment file format used was + * created with Lucene 7 or older. 
Users of this API must be aware that Lucene doesn't guarantee + * semantic compatibility for indices created with versions older than N-1. All backwards + * compatibility aside of the file format is optional and applied on a best effort basis. Review comment: s/`of`/`from` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
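The `-min_version_created` parsing under review can be reduced to a small self-contained sketch. This is a stand-in, not `CheckIndex` itself; only the option name and the error message come from the diff, the rest is illustrative.

```java
// Stand-in for the CheckIndex option parsing discussed above: reject a
// trailing option with no value, otherwise parse the requested major version.
final class MinVersionOption {
    static int parse(String[] args, int defaultMajor) {
        int min = defaultMajor;
        for (int i = 0; i < args.length; i++) {
            if ("-min_version_created".equals(args[i])) {
                if (i == args.length - 1) {
                    // The error path the review asks to document in Usage.
                    throw new IllegalArgumentException(
                        "ERROR: missing value for -min_version_created");
                }
                min = Integer.parseInt(args[++i]);
            }
        }
        return min;
    }
}
```

Whatever name the option ends up with (`-min_major_version_created` is suggested above), the parsing shape stays the same.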
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
romseygeek commented on a change in pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#discussion_r559604398 ## File path: lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java ## @@ -104,6 +104,23 @@ public static DirectoryReader open(final IndexCommit commit) throws IOException return StandardDirectoryReader.open(commit.getDirectory(), commit); } + /** + * Expert: returns an IndexReader reading the index in the given {@link IndexCommit}. This method + * allows to open indices that were created wih a Lucene version older than N-1 provided that all + * all codecs for this index are available in the classpath and the segment file format used was + * created with Lucene 7 or older. Users of this API must be aware that Lucene doesn't guarantee Review comment: The javadoc should read 'Lucene 7 or newer' I think? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #2216: SOLR-15085 Prevent EmbeddedSolrServer calling shutdown on a CoreConta…
madrob commented on a change in pull request #2216: URL: https://github.com/apache/lucene-solr/pull/2216#discussion_r559611053 ## File path: solr/core/src/java/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java ## @@ -71,6 +71,7 @@ protected final String coreName; private final SolrRequestParsers _parser; private final RequestWriterSupplier supplier; + private boolean containerIsLocal = false; Review comment: Can this be final? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
dsmiley commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559588597 ## File path: solr/contrib/scripting/README.md ## @@ -0,0 +1,14 @@ +Welcome to Apache Solr Scripting! +=== + +# Introduction + +The Scripting contrib module pulls together various scripting related functions. + +Today, the ScriptUpdateProcessorFactory allows Java scripting engines to be used during the Solr document update processing, allowing dramatic flexibility in expressing custom document processing before being indexed. It also allows hooks to commit, delete, etc, but add is the most common usage. It is implemented as an UpdateProcessor to be placed in an UpdateChain. Review comment: Let's list a few popular options here -- I'm thinking JavaScript, Ruby, Python, Groovy ## File path: solr/contrib/scripting/src/java/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.java ## @@ -58,34 +60,34 @@ /** * - * An update request processor factory that enables the use of update - * processors implemented as scripts which can be loaded by the - * {@link SolrResourceLoader} (usually via the conf dir for - * the SolrCore). + * An update request processor factory that enables the use of update + * processors implemented as scripts which can be loaded by the + * {@link SolrResourceLoader} (usually via the conf dir for + * the SolrCore). Previously known as the StatelessScriptUpdateProcessor. Review comment: ```suggestion * processors implemented as scripts which can be loaded from the * configSet. Previously known as the StatelessScriptUpdateProcessor. ``` ## File path: solr/solr-ref-guide/src/scripting-update-processor.adoc ## @@ -0,0 +1,295 @@ += Scripting Update Processor Review comment: ```suggestion = Script Update Processor ``` And can we rename this file to remove the "ing"? The PR shows this file as new; did you just write all this?
## File path: solr/solr-ref-guide/src/scripting-update-processor.adoc ## @@ -0,0 +1,295 @@ += Scripting Update Processor +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +The {solr-javadocs}/contrib/scripting/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.html[ScriptUpdateProcessor] allows Java scripting engines to be used +during Solr document update processing, allowing dramatic flexibility in +expressing custom document processing logic before being indexed. It has hooks to the +commit, delete, rollback, etc indexing actions, however add is the most common usage. +It is implemented as an UpdateProcessor to be placed in an UpdateChain. + +TIP: This used to be known as the _StatelessScriptingUpdateProcessor_ and was renamed to clarify the key aspect of this update processor is it enables scripting. + +The script can be written in any scripting language supported by your JVM (such +as JavaScript), and executed dynamically so no pre-compilation is necessary. + +WARNING: Being able to run a script of your choice as part of the indexing pipeline is a really powerful tool, that I sometimes call the +_Get out of jail free_ card because you can solve some problems this way that you can't in any other way. 
However, you are introducing some +potential security vulnerabilities. + +== Installing the ScriptingUpdateProcessor and Scripting Engines + +The scripting update processor lives in the contrib module `/contrib/scripting`, and you need to explicitly add it to your Solr setup. + +Java 11 and previous versions come with a JavaScript engine called Nashorn, but Java 12 will require you to add your own JavaScript engine. Other supported scripting engines like +JRuby, Jython, Groovy, all require you to add JAR files. + + +You can either add the `dist/solr-scripting-*.jar` file into Solr’s resource loader in a core `lib/` directory, or via `<lib>` directives in `solrconfig.xml`: + +[source,xml] + + + + +Likewise you will need to add some JAR files depending on which scripting engines you choose. + + +== Configuration + +[source,xml] + + + + update-script.js + + + + + +
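The two `[source,xml]` listings in the diff above lost their markup when this page was rendered. A hedged reconstruction of what such a configuration usually looks like follows; the `<lib>` path, the chain name, and the trailing `RunUpdateProcessorFactory` are assumptions based on common Solr setups, not the exact text of the PR (only `update-script.js` survives in the residue).

```xml
<!-- Hypothetical solrconfig.xml sketch; paths and names are assumptions. -->
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-scripting-\d.*\.jar" />

<updateRequestProcessorChain name="script">
  <processor class="org.apache.solr.scripting.update.ScriptUpdateProcessorFactory">
    <str name="script">update-script.js</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The processor class path here follows the javadoc link quoted earlier in this review thread.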
[GitHub] [lucene-solr] s1monw commented on a change in pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
s1monw commented on a change in pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#discussion_r559614818 ## File path: lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java ## @@ -104,6 +104,23 @@ public static DirectoryReader open(final IndexCommit commit) throws IOException return StandardDirectoryReader.open(commit.getDirectory(), commit); } + /** + * Expert: returns an IndexReader reading the index in the given {@link IndexCommit}. This method + * allows to open indices that were created wih a Lucene version older than N-1 provided that all + * all codecs for this index are available in the classpath and the segment file format used was + * created with Lucene 7 or older. Users of this API must be aware that Lucene doesn't guarantee Review comment: 👍 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on a change in pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
s1monw commented on a change in pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#discussion_r559618608 ## File path: lucene/core/src/java/org/apache/lucene/index/CheckIndex.java ## @@ -3900,6 +3907,12 @@ public static Options parseOptions(String[] args) { } i++; opts.dirImpl = args[i]; + } else if ("-min_version_created".equals(args[i])) { +if (i == args.length - 1) { + throw new IllegalArgumentException("ERROR: missing value for -min_version_created"); Review comment: I am all for always allowing old indices and removing this option. WDYT?
[GitHub] [lucene-solr] s1monw commented on a change in pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
s1monw commented on a change in pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#discussion_r559620857 ## File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java ## @@ -1009,6 +1009,14 @@ public IndexWriter(Directory d, IndexWriterConfig conf) throws IOException { changed(); } else if (reader != null) { +if (reader.segmentInfos.getIndexCreatedVersionMajor() < Version.MIN_SUPPORTED_MAJOR) { Review comment: yes we verify that it's the same major as the index we are adding to. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
s1monw commented on pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#issuecomment-762301076 @mikemccand pushed changes
[GitHub] [lucene-solr] timatbw commented on a change in pull request #2216: SOLR-15085 Prevent EmbeddedSolrServer calling shutdown on a CoreConta…
timatbw commented on a change in pull request #2216: URL: https://github.com/apache/lucene-solr/pull/2216#discussion_r559642959 ## File path: solr/core/src/java/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java ## @@ -71,6 +71,7 @@ protected final String coreName; private final SolrRequestParsers _parser; private final RequestWriterSupplier supplier; + private boolean containerIsLocal = false; Review comment: I tried to do that, but it gets awkward because there are 5 constructors and one calls another, which calls another. I'd have to refactor all of them to call a private constructor instead, to avoid changing the external constructor parameters. Do you think I should do that?
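The refactor being weighed here — funnelling every public constructor through one private constructor so the flag can be `final` — can be sketched like this. These are simplified stand-ins, not the real EmbeddedSolrServer and its five constructors.

```java
// Stand-in for CoreContainer.
class ContainerSketch { }

// Stand-in showing the delegation pattern: the external constructor
// signatures are unchanged, and only the private constructor assigns
// the final fields.
class ServerSketch {
    private final ContainerSketch container;
    private final boolean containerIsLocal;

    // The single private constructor that actually sets the final fields.
    private ServerSketch(ContainerSketch container, boolean containerIsLocal) {
        this.container = container;
        this.containerIsLocal = containerIsLocal;
    }

    // Caller-provided container: not ours to shut down.
    ServerSketch(ContainerSketch provided) {
        this(provided, false);
    }

    // Internally-created container: we own its lifecycle.
    ServerSketch() {
        this(new ContainerSketch(), true);
    }

    boolean isContainerLocal() { return containerIsLocal; }
}
```

Chained constructors can keep calling each other as before, as long as each chain bottoms out in the private constructor.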
[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
epugh commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559661956 ## File path: solr/solr-ref-guide/src/scripting-update-processor.adoc ## @@ -0,0 +1,295 @@ += Scripting Update Processor Review comment: Will update the name and the file. I added this file to the Ref Guide, however much (most?) of the content was sourced from the old Solr cwiki. Part of my goal in this work is to raise the profile of this powerful feature, so I wanted the great content to be visible. I did manually test all of this stuff (jython, groovy etc) when I first started working on it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul merged pull request #2177: SOLR-15052: Per-replica states for reducing overseer bottlenecks (trunk)
noblepaul merged pull request #2177: URL: https://github.com/apache/lucene-solr/pull/2177
[jira] [Commented] (SOLR-15052) Reducing overseer bottlenecks using per-replica states
[ https://issues.apache.org/jira/browse/SOLR-15052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267383#comment-17267383 ] ASF subversion and git services commented on SOLR-15052: Commit 8505d4d416fdf707bab55bc4da9a71ddb3374274 in lucene-solr's branch refs/heads/master from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8505d4d ] SOLR-15052: Per-replica states for reducing overseer bottlenecks (trunk) (#2177) > Reducing overseer bottlenecks using per-replica states > -- > > Key: SOLR-15052 > URL: https://issues.apache.org/jira/browse/SOLR-15052 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Fix For: 8.8 > > Attachments: per-replica-states-gcp.pdf > > Time Spent: 10.5h > Remaining Estimate: 0h > > This work has the same goal as SOLR-13951, that is to reduce overseer > bottlenecks by avoiding replica state updates from going to the state.json > via the overseer. However, the approach taken here is different from > SOLR-13951 and hence this work supercedes that work. > The design proposed is here: > https://docs.google.com/document/d/1xdxpzUNmTZbk0vTMZqfen9R3ArdHokLITdiISBxCFUg/edit > Briefly, > # Every replica's state will be in a separate znode nested under the > state.json. It has the name that encodes the replica name, state, leadership > status. > # An additional children watcher to be set on state.json for state changes. > # Upon a state change, a ZK multi-op to delete the previous znode and add a > new znode with new state. > Differences between this and SOLR-13951, > # In SOLR-13951, we planned to leverage shard terms for per shard states. > # As a consequence, the code changes required for SOLR-13951 were massive (we > needed a shard state provider abstraction and introduce it everywhere in the > codebase). 
> # This approach is a drastically simpler change and design. > Credits for this design and the PR is due to [~noble.paul]. > [~markrmil...@gmail.com], [~noble.paul] and I have collaborated on this > effort. The reference branch takes a conceptually similar (but not identical) > approach. > I shall attach a PR and performance benchmarks shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9673) The level of IntBlockPool slice is always 1
[ https://issues.apache.org/jira/browse/LUCENE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267389#comment-17267389 ] Michael McCandless commented on LUCENE-9673: Whoa, this is a horrible and probably ancient bug! The slices are supposed to increase in size as ints are appended to the logical single (chunked) stream, to lower the overhead as the number of stored ints grows. Does {{ByteBlockPool}} have the same issue? > The level of IntBlockPool slice is always 1 > > > Key: LUCENE-9673 > URL: https://issues.apache.org/jira/browse/LUCENE-9673 > Project: Lucene - Core > Issue Type: Bug > Components: core/other >Reporter: mashudong >Priority: Minor > > First slice is allocated by IntBlockPool.newSlice(), and its level is 1, > > {code:java} > private int newSlice(final int size) { > if (intUpto > INT_BLOCK_SIZE-size) { > nextBuffer(); > assert assertSliceBuffer(buffer); > } > > final int upto = intUpto; > intUpto += size; > buffer[intUpto-1] = 1; > return upto; > }{code} > > > If one slice is not enough, IntBlockPool.allocSlice() is called to allocate > more slices, > as the following code shows, level is 1, newLevel is NEXT_LEVEL_ARRAY[0] > which is also 1. > > The result is the level of IntBlockPool slice is always 1, the first slice is > 2 bytes long, and all subsequent slices are 4 bytes long. 
> > {code:java} > private static final int[] NEXT_LEVEL_ARRAY = {1, 2, 3, 4, 5, 6, 7, 8, 9, 9}; > private int allocSlice(final int[] slice, final int sliceOffset) { > final int level = slice[sliceOffset]; > final int newLevel = NEXT_LEVEL_ARRAY[level - 1]; > final int newSize = LEVEL_SIZE_ARRAY[newLevel]; > // Maybe allocate another block > if (intUpto > INT_BLOCK_SIZE - newSize) { > nextBuffer(); > assert assertSliceBuffer(buffer); > } > final int newUpto = intUpto; > final int offset = newUpto + intOffset; > intUpto += newSize; > // Write forwarding address at end of last slice: > slice[sliceOffset] = offset; > // Write new level: > buffer[intUpto - 1] = newLevel; > return newUpto; > } > {code} >
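A quick way to see the bug reported above is to replay the level transitions outside of Lucene. This sketch follows `allocSlice()`'s rule (`newLevel = NEXT_LEVEL_ARRAY[level - 1]`) starting from the level 1 that `newSlice()` writes, next to the 0-based progression one would expect (ByteBlockPool-style); only the levels are tracked, since each slice's size is then looked up in `LEVEL_SIZE_ARRAY[newLevel]`:

```python
NEXT_LEVEL_ARRAY = [1, 2, 3, 4, 5, 6, 7, 8, 9, 9]

def levels(first, step, n=6):
    # Replay n slice allocations, applying the given level-transition rule.
    out = [first]
    for _ in range(n - 1):
        out.append(step(out[-1]))
    return out

# IntBlockPool as written: newSlice() stores level 1, and allocSlice()
# computes NEXT_LEVEL_ARRAY[1 - 1] == 1, so the level never advances.
buggy = levels(1, lambda lvl: NEXT_LEVEL_ARRAY[lvl - 1])

# A 0-based indexing scheme would let the level grow, so each later slice
# would come from a progressively larger LEVEL_SIZE_ARRAY bucket.
expected = levels(0, lambda lvl: NEXT_LEVEL_ARRAY[lvl])

print(buggy)     # [1, 1, 1, 1, 1, 1]
print(expected)  # [0, 1, 2, 3, 4, 5]
```

The "expected" rule is a sketch of the intent, not the actual fix that was committed.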
[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
epugh commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559672654 ## File path: solr/contrib/scripting/README.md ## @@ -0,0 +1,14 @@ +Welcome to Apache Solr Scripting! +=== + +# Introduction + +The Scripting contrib module pulls together various scripting related functions. + +Today, the ScriptUpdateProcessorFactory allows Java scripting engines to be used during the Solr document update processing, allowing dramatic flexibility in expressing custom document processing before being indexed. It also allows hooks to commit, delete, etc, but add is the most common usage. It is implemented as an UpdateProcessor to be placed in an UpdateChain. Review comment: Thanks, reworked this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9670) gradle precommit sometimes fails with "IOException: stream closed" from javadoc in nightly benchmarks
[ https://issues.apache.org/jira/browse/LUCENE-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267390#comment-17267390 ] Michael McCandless commented on LUCENE-9670: Thanks [~dweiss]; I'll try that. > gradle precommit sometimes fails with "IOException: stream closed" from > javadoc in nightly benchmarks > - > > Key: LUCENE-9670 > URL: https://issues.apache.org/jira/browse/LUCENE-9670 > Project: Lucene - Core > Issue Type: Bug >Reporter: Michael McCandless >Priority: Major > > I recently added tracking how long {{gradle precommit}} takes each night so > we can track slowdowns over time. > But it sometimes fails with: > {noformat} > > Task :lucene:join:renderJavadoc FAILED > Could not read standard output of command '/opt/jdk-15.0.1/bin/javadoc'. > java.io.IOException: Stream Closed > at java.base/java.io.FileOutputStream.writeBytes(Native Method) > at java.base/java.io.FileOutputStream.write(FileOutputStream.java:347) > at > java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) > at > java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142) > at > org.gradle.process.internal.streams.ExecOutputHandleRunner.forwardContent(ExecOutputHandleRunner.java:68) > at > org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53) > at > org.gradle.internal.operations.CurrentBuildOperationPreservingRunnable.run(CurrentBuildOperationPreservingRunnable.java:42) > at > org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64) > at > org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > at > org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56) 
> at java.base/java.lang.Thread.run(Thread.java:832) {noformat} > I'm not sure why ... when I run {{./gradlew precommit}} interactively it > doesn't seem to do this. > The nightly tool is quite simple – it just launches a sub-process using > {{os.system}}: (first to {{git clean}} then to run {{./gradlew precommit)}}: > https://github.com/mikemccand/luceneutil/blob/master/src/python/runNightlyGradleTestPrecommit.py -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
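Following Dawid's suggestion in the earlier comment, the nightly driver could move from `os.system` to `subprocess` and pass `--no-daemon`, so each run gets a fresh Gradle process whose stdout/stderr pipes belong to the Python parent. The function names and paths below are hypothetical, not the actual luceneutil code:

```python
import subprocess

def precommit_command(gradlew="./gradlew"):
    # --no-daemon forces a fresh Gradle process instead of attaching to a
    # long-lived daemon, whose stream handling under a non-interactive
    # parent is the suspected source of the "Stream Closed" IOException.
    return [gradlew, "--no-daemon", "precommit"]

def run_precommit(repo_dir):
    # check=True raises CalledProcessError on a non-zero exit, so the
    # nightly driver notices a failed precommit instead of silently moving on.
    return subprocess.run(precommit_command(), cwd=repo_dir,
                          capture_output=True, text=True, check=True)
```

Unlike `os.system`, `subprocess.run` also captures the task output for the nightly logs rather than interleaving it with the parent's stdout.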
[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
epugh commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559674364 ## File path: solr/CHANGES.txt ## @@ -186,6 +186,9 @@ Other Changes * SOLR-14034: Remove deprecated min_rf references (Tim Dillon) +* SOLR-14067: StatelessScriptUpdateProcessor moved to it's own /contrib/scripting/ package instead + of shipping as part of Solr due to security concerns. Renamed to ScriptingUpdateProcessor. (Eric Pugh) Review comment: I don't love that we have the `Factory` suffix, as that feels like an implementation detail of how update processors work. I wish we could refer to this as the `ScriptUpdateProcessor`, even though this specific class is buried in the file `ScriptUpdateProcessorFactory.java` as: ``` private static class ScriptUpdateProcessor extends UpdateRequestProcessor ``` Would it be worth pulling the inner class to its own `ScriptUpdateProcessor.java` file? Then we could just link to that file everywhere. Thoughts?
[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
epugh commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559695973 ## File path: solr/server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml ## @@ -674,12 +679,12 @@ *** WARNING *** Before enabling remote streaming, you should make sure your system has authentication enabled. - - +http://localhost:8983/solr/techproducts/update?commit=true&stream.contentType=text/csv&fieldnames=id,description&stream.body=1,foo&update.chain=script ``` If this is too dangerous, I could revert this change and document the need to make the change in the directions. It's just one more barrier to easily trying the feature with the tech products example. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
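The example URL in the comment above can also be assembled programmatically, which makes the individual `stream.body` parameters easier to read. A small sketch, assuming a local techproducts instance on port 8983 with remote streaming enabled:

```python
from urllib.parse import urlencode

# The same parameters as the URL quoted in the review comment;
# update.chain=script routes the update through the scripting chain.
params = {
    "commit": "true",
    "stream.contentType": "text/csv",
    "fieldnames": "id,description",
    "stream.body": "1,foo",
    "update.chain": "script",
}
url = "http://localhost:8983/solr/techproducts/update?" + urlencode(params)
print(url)
```

`urlencode` percent-escapes the commas and slashes, so the resulting URL is safe to paste into a browser or pass to curl.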
[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
epugh commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559696636 ## File path: solr/solr-ref-guide/src/scripting-update-processor.adoc ## @@ -0,0 +1,295 @@ += Scripting Update Processor Review comment: Changes made!
[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
epugh commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559696904 ## File path: solr/solr-ref-guide/src/scripting-update-processor.adoc ## @@ -0,0 +1,295 @@ += Scripting Update Processor +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +The {solr-javadocs}/contrib/scripting/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.html[ScriptUpdateProcessor] allows Java scripting engines to be used +during Solr document update processing, allowing dramatic flexibility in +expressing custom document processing logic before being indexed. It has hooks to the +commit, delete, rollback, etc indexing actions, however add is the most common usage. +It is implemented as an UpdateProcessor to be placed in an UpdateChain. + +TIP: This used to be known as the _StatelessScriptingUpdateProcessor_ and was renamed to clarify the key aspect of this update processor is it enables scripting. + +The script can be written in any scripting language supported by your JVM (such +as JavaScript), and executed dynamically so no pre-compilation is necessary. 
+ +WARNING: Being able to run a script of your choice as part of the indexing pipeline is a really powerful tool that I sometimes call the +_Get out of jail free_ card because you can solve some problems this way that you can't in any other way. However, you are introducing some +potential security vulnerabilities. + +== Installing the ScriptingUpdateProcessor and Scripting Engines + +The scripting update processor lives in the contrib module `/contrib/scripting`, and you need to explicitly add it to your Solr setup. + +Java 11 and previous versions come with a JavaScript engine called Nashorn, but Java 12 will require you to add your own JavaScript engine. Other supported scripting engines, like +JRuby, Jython, and Groovy, all require you to add JAR files. + + +You can either add the `dist/solr-scripting-*.jar` file into Solr’s resource loader in a core `lib/` directory, or via `<lib>` directives in `solrconfig.xml`: + +[source,xml] + + + + +Likewise you will need to add some JAR files depending on which scripting engines you choose. + + +== Configuration + +[source,xml] + + + + update-script.js + + + + + + + +NOTE: The processor supports the defaults/appends/invariants concept for its config. +However, it is also possible to skip this level and configure the parameters directly underneath the `<processor>` tag. + +Below follows a list of each configuration parameter and its meaning: + +`script`:: +The script file name. The script file must be placed in the `conf/` directory. +There can be one or more "script" parameters specified; multiple scripts are executed in the order specified. + +`engine`:: +Optionally specifies the scripting engine to use. This is only needed if the extension +of the script file is not a standard mapping to the scripting engine. For example, if your +script file was coded in JavaScript but the file name was called `update-script.foo`, +use "javascript" as the engine name. + +`params`:: +Optional parameters that are passed into the script execution context.
This is +specified as a named list (`<lst>`) structure with nested typed parameters. If +specified, the script context will get a "params" object, otherwise there will be no "params" object available. + + +== Script execution context + +Every script has some variables provided to it. + +`logger`:: +Logger (org.slf4j.Logger) instance. This is useful for logging information from the script. + +`req`:: +{solr-javadocs}/core/org/apache/solr/request/SolrQueryRequest.html[SolrQueryRequest] instance. + +`rsp`:: +{solr-javadocs}/core/org/apache/solr/response/SolrQueryResponse.html[SolrQueryResponse] instance. + +`params`:: +The "params" object, if specified, from the configuration. + +== Examples + +The `processAdd()` and the other script methods can return false to skip further +processing of the document. All methods must be defined, though generally the +`processAdd()` method is where the action is. + +Here's a URL that works with the techproducts example setup demonstrating specifying +the "script" update chain: `ht
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
dsmiley commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559731569 ## File path: solr/CHANGES.txt ## @@ -186,6 +186,9 @@ Other Changes * SOLR-14034: Remove deprecated min_rf references (Tim Dillon) +* SOLR-14067: StatelessScriptUpdateProcessor moved to it's own /contrib/scripting/ package instead + of shipping as part of Solr due to security concerns. Renamed to ScriptingUpdateProcessor. (Eric Pugh) Review comment: I agree 100% on seeing \*Factory all over the config being poor for the reason you gave. It could also be argued that even the "UpdateProcessor" part is quite redundant based on where we declare it. Have you noticed changes in Lucene to how schema analysis components are resolved, affecting the Solr schema (master only)? See solr/server/solr/configsets/_default/conf/managed-schema -- `` it's beautiful. No "FilterFactory" suffix. Eventually I hope we can take the same approach throughout Solr. Lucene uses an SPI approach which means a special file listing each implementation. Something like that could be embraced. No need to separate the factory from the inner class over this.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
dsmiley commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559733095 ## File path: solr/server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml ## @@ -674,12 +679,12 @@ *** WARNING *** Before enabling remote streaming, you should make sure your system has authentication enabled. - - +
[jira] [Created] (LUCENE-9676) Hunspell: improve stemming of all-caps words
Peter Gromov created LUCENE-9676: Summary: Hunspell: improve stemming of all-caps words Key: LUCENE-9676 URL: https://issues.apache.org/jira/browse/LUCENE-9676 Project: Lucene - Core Issue Type: Improvement Reporter: Peter Gromov Currently words like "OPENOFFICE.ORG" result in no stems even if the dictionary contains "OpenOffice.org"
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor
dsmiley commented on a change in pull request #2215: URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559735328 ## File path: solr/solr-ref-guide/src/script-update-processor.adoc ## @@ -35,19 +35,19 @@ potential security vulnerabilities. The scripting update processor lives in the contrib module `/contrib/scripting`, and you need to explicitly add it to your Solr setup. -Java 11 and previous versions come with a JavaScript engine called Nashorn, but Java 12 will require you to add your own JavaScript engine. Other supported scripting engines like -JRuby, Jython, Groovy, all require you to add JAR files. - - -You can either add the `dist/solr-scripting-*.jar` file into Solr’s resource loader in a core `lib/` directory, or via `` directives in `solrconfig.xml`: +You can either add the `dist/solr-scripting-*.jar` file into Solr’s core `lib/` directory, or via `` directives in `solrconfig.xml`: Review comment: It's more probable someone would use SOLR_HOME/lib than a lib directory on a core. And FYI `` directives may be going away or be discouraged. Let's link to `<>` instead of enumerating how to do this, so we can just maintain this sort of info in one place in the ref guide. ## File path: solr/solr-ref-guide/src/script-update-processor.adoc ## @@ -267,8 +267,8 @@ def finish() { } -=== Jython - +=== Python +Python support is implemented via the https://www.jython.org/[Jython] project. Put the *standalone* `jython.jar` (the JAR that contains all the dependencies) into Solr's resource loader. Review comment: Please remove "resource loader" from this page. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] donnerpeter opened a new pull request #2217: LUCENE-9676: Hunspell: improve stemming of all-caps words
donnerpeter opened a new pull request #2217: URL: https://github.com/apache/lucene-solr/pull/2217 # Description Currently words like "OPENOFFICE.ORG" result in no stems even if the dictionary contains "OpenOffice.org" # Solution Repeat Hunspell's logic: * when encountering a mixed- or (inflectable) all-case dictionary entry, add its title-case analog as a hidden entry * use that hidden entry for stemming case variants for title- and uppercase words, but don't consider it a valid word itself * ...unless there's another explicit dictionary entry of that title case # Tests Adapted `allcaps` from Hunspell C++ repository, corrected existing `TestEscaped` to match Hunspell's behavior. # Checklist Please review the following and check all that apply: - [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [ ] I have created a Jira issue and added the issue ID to my pull request title. - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [ ] I have developed this patch against the `master` branch. - [ ] I have run `./gradlew check`. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
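The three solution bullets in the PR description can be mimicked in a toy model: a mixed-case entry gets a hidden title-case analog, the hidden analog stems title- and upper-case variants but is not itself a valid word, and an explicit dictionary entry of that title case takes precedence. This is a sketch of the idea only, not the actual Lucene/Hunspell implementation:

```python
def title_case(word):
    return word[:1].upper() + word[1:].lower()

def build_entries(dictionary):
    # word -> (stem, hidden?)
    entries = {w: (w, False) for w in dictionary}
    for w in dictionary:
        tc = title_case(w)
        if tc != w and tc not in entries:
            # Hidden title-case analog of a mixed-case entry; an explicit
            # entry of that title case (already present above) wins.
            entries[tc] = (w, True)
    return entries

def stems(word, entries):
    found = []
    entry = entries.get(word)
    if entry and not entry[1]:          # exact match, but hidden entries
        found.append(entry[0])          # are not valid words themselves
    if word.isupper() or word == title_case(word):
        entry = entries.get(title_case(word))
        if entry and entry[0] not in found:
            found.append(entry[0])      # hidden analogs do stem case variants
    return found
```

With a dictionary containing only "OpenOffice.org", this model stems "OPENOFFICE.ORG" back to the dictionary form, while the plain lowercase "openoffice.org" still yields no stems.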
[jira] [Updated] (LUCENE-9671) Hunspell: shorten Stemmer.applyAffix
[ https://issues.apache.org/jira/browse/LUCENE-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Gromov updated LUCENE-9671: - Status: Patch Available (was: Open) > Hunspell: shorten Stemmer.applyAffix > > > Key: LUCENE-9671 > URL: https://issues.apache.org/jira/browse/LUCENE-9671 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Reporter: Peter Gromov >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h >
[jira] [Updated] (LUCENE-9676) Hunspell: improve stemming of all-caps words
[ https://issues.apache.org/jira/browse/LUCENE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Gromov updated LUCENE-9676: - Status: Patch Available (was: Open) > Hunspell: improve stemming of all-caps words > > > Key: LUCENE-9676 > URL: https://issues.apache.org/jira/browse/LUCENE-9676 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Peter Gromov >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently words like "OPENOFFICE.ORG" result in no stems even if the > dictionary contains "OpenOffice.org"
[jira] [Commented] (SOLR-15052) Reducing overseer bottlenecks using per-replica states
[ https://issues.apache.org/jira/browse/SOLR-15052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267451#comment-17267451 ] Ilan Ginzburg commented on SOLR-15052: -- [~ichattopadhyaya], following up on [~mdrob]'s message above about performance testing, have you looked at handling of DOWNNODE messages? Under some conditions (many replicas on each node for each collection) I believe the per replica state can end up being slower than a single state.json update. Moreover the current implementation serializes all such updates (this can of course be improved later). I believe that's the most unfavorable case for the per replica state strategy. > Reducing overseer bottlenecks using per-replica states > -- > > Key: SOLR-15052 > URL: https://issues.apache.org/jira/browse/SOLR-15052 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Fix For: 8.8 > > Attachments: per-replica-states-gcp.pdf > > Time Spent: 10.5h > Remaining Estimate: 0h > > This work has the same goal as SOLR-13951, that is to reduce overseer > bottlenecks by avoiding replica state updates from going to the state.json > via the overseer. However, the approach taken here is different from > SOLR-13951 and hence this work supercedes that work. > The design proposed is here: > https://docs.google.com/document/d/1xdxpzUNmTZbk0vTMZqfen9R3ArdHokLITdiISBxCFUg/edit > Briefly, > # Every replica's state will be in a separate znode nested under the > state.json. It has the name that encodes the replica name, state, leadership > status. > # An additional children watcher to be set on state.json for state changes. > # Upon a state change, a ZK multi-op to delete the previous znode and add a > new znode with new state. > Differences between this and SOLR-13951, > # In SOLR-13951, we planned to leverage shard terms for per shard states. 
> # As a consequence, the code changes required for SOLR-13951 were massive (we > needed a shard state provider abstraction and introduce it everywhere in the > codebase). > # This approach is a drastically simpler change and design. > Credits for this design and the PR is due to [~noble.paul]. > [~markrmil...@gmail.com], [~noble.paul] and I have collaborated on this > effort. The reference branch takes a conceptually similar (but not identical) > approach. > I shall attach a PR and performance benchmarks shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559740603 ## File path: solr/core/src/java/org/apache/solr/cloud/ExclusiveSliceProperty.java ## @@ -74,8 +74,8 @@ ExclusiveSliceProperty(ClusterState clusterState, ZkNodeProps message) { this.clusterState = clusterState; String tmp = message.getStr(ZkStateReader.PROPERTY_PROP); -if (StringUtils.startsWith(tmp, OverseerCollectionMessageHandler.COLL_PROP_PREFIX) == false) { - tmp = OverseerCollectionMessageHandler.COLL_PROP_PREFIX + tmp; +if (StringUtils.startsWith(tmp, CollectionAdminParams.PROPERTY_PREFIX) == false) { Review comment: Minor: use `!` rather than `== false` (I know it's old code but you touched it :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabelle Giguere updated SOLR-7913: --- Attachment: SOLR-7913.patch > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fixTests.patch, SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559742269 ## File path: solr/core/src/java/org/apache/solr/cloud/api/collections/DeleteCollectionCmd.java ## @@ -92,6 +98,19 @@ public void call(ClusterState state, ZkNodeProps message, @SuppressWarnings({"ra collection = extCollection; } +PlacementPlugin placementPlugin = ocmh.overseer.getCoreContainer().getPlacementPluginFactory().createPluginInstance(); Review comment: Didn't dig into the details here, but when we delete a collection, we should just check if there's another collection that defines `withCollection` on it and refuse the delete based on that, no? Just starting to look at the PR so maybe there's a reason for doing it this way (in which case maybe adding a comment?) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
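The reviewer's alternative — refuse the delete when any other collection points at this one via `withCollection` — is simple enough to state directly. A sketch using an invented in-memory cluster-state map rather than Solr's real APIs:

```python
def collections_depending_on(cluster_state, name):
    """Return collections whose 'withCollection' property targets `name`."""
    return sorted(coll for coll, props in cluster_state.items()
                  if props.get("withCollection") == name)

def vet_delete(cluster_state, name):
    # Refuse the delete if another collection declares withCollection=name.
    blockers = collections_depending_on(cluster_state, name)
    if blockers:
        raise ValueError(
            f"cannot delete '{name}': required by {', '.join(blockers)}")
```

Whether this check belongs in the command itself or behind the placement plugin is exactly the encapsulation question raised later in the review thread.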
[jira] [Commented] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267462#comment-17267462 ] Isabelle Giguere commented on SOLR-7913: New patch on tag release/lucene-solr/8.5.0. I had forgotten to attach it when upgrading last time. IMPORTANT: There was a bug in previous patches. Changes in SearchHandler and ShardRequest in the previous patches resulted in including the shard request URL in the "stream.body" passed to the MLT request. That's why test results in CloudMLTQParserTest were different, when comparing the test with the id request (testMLTQParser) and the test with stream.body (testMLTQParserStreamBody). > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fixTests.patch, SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser.
[jira] [Commented] (LUCENE-9675) Expose the compression mode of the binary doc values
[ https://issues.apache.org/jira/browse/LUCENE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267463#comment-17267463 ] David Smiley commented on LUCENE-9675: -- I noticed the removal of {{meta.writeByte((byte) 0);}} (or 1) but doesn't this introduce a backwards-compatibility issue? > Expose the compression mode of the binary doc values > > > Key: LUCENE-9675 > URL: https://issues.apache.org/jira/browse/LUCENE-9675 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jim Ferenczi >Priority: Minor > Attachments: LUCENE-9675.patch, LUCENE-9675.patch > > > LUCENE-9378 introduced a way to configure the compression mode of the binary > doc values. > This issue is a proposal to expose this information in the attributes of each > binary field. > That would expose this information to external readers on a per-field basis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559749942 ## File path: solr/core/src/java/org/apache/solr/cloud/api/collections/DeleteReplicaCmd.java ## @@ -147,14 +170,27 @@ void deleteReplicaBasedOnCount(ClusterState clusterState, } } +if (placementPlugin != null) { Review comment: By adding `if (placementPlugin != null)` logic in the `*Cmd` classes, we are breaking the encapsulation that placement logic is handled by `Assign.AssignStrategy`. The only reason `*Cmd` code currently (before this PR) is even aware of the notion of `PlacementPlugin` is because `PlacementPluginAssignStrategy` configuration (rather than `LegacyAssignStrategy`) is dependent on a plugin being defined... I suggest to move all `*Cmd` vetting logic to `Assign.AssignStrategy`, so that `*Cmd` only need to pass the instance of `PlacementPlugin` to that code (and later when we finally decide how clusters are configured this will go away, and we keep `*Cmd` clean of any specific assign strategy behavior). In `Assign.AssignStrategy` the default vetting logic will be "accept all" and in `PlacementPluginAssignStrategy` we can implement the checks we like. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
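The encapsulation proposed above — vetting lives behind `Assign.AssignStrategy`, with an accept-all default and plugin-specific checks only in `PlacementPluginAssignStrategy` — looks roughly like this. These are Python stand-ins for the Java classes with invented method names, just to show the shape of the design:

```python
class AssignStrategy:
    def verify_delete_replicas(self, cluster_state, replicas):
        return True  # default vetting: accept all deletes


class PlacementPluginAssignStrategy(AssignStrategy):
    def __init__(self, plugin):
        self.plugin = plugin  # the configured PlacementPlugin instance

    def verify_delete_replicas(self, cluster_state, replicas):
        # *Cmd classes only hand over the plugin; every placement-specific
        # check stays behind the strategy interface.
        return self.plugin.verify_allowed(cluster_state, replicas)
```

With this split, the `*Cmd` classes never branch on `placementPlugin != null`; they call the strategy and let polymorphism pick the behavior.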
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559750891 ## File path: solr/core/src/java/org/apache/solr/cloud/overseer/ReplicaMutator.java ## @@ -115,8 +116,8 @@ public ZkWriteCommand addReplicaProperty(ClusterState clusterState, ZkNodeProps String sliceName = message.getStr(ZkStateReader.SHARD_ID_PROP); String replicaName = message.getStr(ZkStateReader.REPLICA_PROP); String property = message.getStr(ZkStateReader.PROPERTY_PROP).toLowerCase(Locale.ROOT); -if (StringUtils.startsWith(property, OverseerCollectionMessageHandler.COLL_PROP_PREFIX) == false) { - property = OverseerCollectionMessageHandler.COLL_PROP_PREFIX + property; +if (StringUtils.startsWith(property, CollectionAdminParams.PROPERTY_PREFIX) == false) { Review comment: `== false` -> `!` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559751155 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/DeleteReplicasRequest.java ## @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import org.apache.solr.cluster.Replica; + +import java.util.Set; + +/** + * Review comment: Javadoc needed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559751454 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/DeleteShardsRequest.java ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Review comment: Javadoc needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559753902 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/DeleteShardsRequest.java ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import java.util.Set; + +/** + * Review comment: Also, this interface is implemented but the implementation is never used. Unless we implement a use for it in this PR, I suggest we leave it out until we actually need it. I assume we don't need it for `withCollection` because the secondary collection has to be single shard so that shard will not be deleted. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabelle Giguere updated SOLR-7913: --- Attachment: SOLR-7913_fix-unit-test-setup.patch > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fix-unit-test-setup.patch, > SOLR-7913_fixTests.patch, SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559755408 ## File path: solr/core/src/java/org/apache/solr/cloud/api/collections/DeleteReplicaCmd.java ## @@ -147,14 +170,27 @@ void deleteReplicaBasedOnCount(ClusterState clusterState, } } +if (placementPlugin != null) { Review comment: In `Assign.AssignStrategy` if we want to be exhaustive, we should for example reject shard splits for secondary collections that are targets of `withCollection` (given we refuse such targets to have more than one shard). Not saying we should do it, but the vetting infra we put in place should allow logical extension to all these aspects (with minor impact on the commands). Also, pushing all the logic to `Assign.AssignStrategy` and minimizing changes to `*Cmd` limits the impact of a regression (moving to `LegacyAssignStrategy` should be a workaround for most problems). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267462#comment-17267462 ] Isabelle Giguere edited comment on SOLR-7913 at 1/18/21, 7:10 PM: -- New patch on tag release/lucene-solr/8.5.0 I had forgotten to attach it when upgrading, last time. IMPORTANT : There was a bug in previous patches. Changes in SearchHandler and ShardRequest in the previous patches resulted in including the shard request URL in the "stream.body" passed to the MLT request. That's why test results in CloudMLTQParserTest were different, when comparing the test with the id request (testMLTQParser) and the test with stream.body (testMLTQParserStreamBody). Apply SOLR-7913_fix-unit-test-setup.patch on top of SOLR-7913.patch was (Author: igiguere): New patch on tag release/lucene-solr/8.5.0 I had forgotten to attach it when upgrading, last time. IMPORTANT : There was a bug in previous patches. Changes in SearchHandler and ShardRequest in the previous patches resulted in including the shard request URL in the "stream.body" passed to the MLT request. That's why test results in CloudMLTQParserTest were different, when comparing the test with the id request (testMLTQParser) and the test with stream.body (testMLTQParserStreamBody). > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fix-unit-test-setup.patch, > SOLR-7913_fixTests.patch, SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on pull request #2212: LUCENE-9669: Add an expert API to allow opening indices created < N-1
s1monw commented on pull request #2212: URL: https://github.com/apache/lucene-solr/pull/2212#issuecomment-762429660 I plan to merge this during the next 24 hours thanks for the reviews This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9675) Expose the compression mode of the binary doc values
[ https://issues.apache.org/jira/browse/LUCENE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267473#comment-17267473 ] Jim Ferenczi commented on LUCENE-9675: -- [~dsmiley] no because we never released a version that write/read this byte. It was added in https://issues.apache.org/jira/browse/LUCENE-9378 to make the compression configurable in 8.8 so I am just changing how we record the information. That's a different story if we release 8.8 without this patch since in this case we'd need to care about bwc. > Expose the compression mode of the binary doc values > > > Key: LUCENE-9675 > URL: https://issues.apache.org/jira/browse/LUCENE-9675 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jim Ferenczi >Priority: Minor > Attachments: LUCENE-9675.patch, LUCENE-9675.patch > > > LUCENE-9378 introduced a way to configure the compression mode of the binary > doc values. > This issue is a proposal to expose this information in the attributes of each > binary field. > That would expose this information to external readers on a per-field basis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
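The change Jim describes — recording the compression mode per field rather than as a raw 0/1 byte in the doc-values metadata — can be sketched with plain maps. This is not Lucene's actual codec API; the attribute key and the mode names below are hypothetical stand-ins:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: per-field attributes carry the binary doc-values
// compression mode, so external readers can query it by field name instead
// of interpreting a format byte in the metadata stream.
public class CompressionAttributeSketch {
  static final String MODE_ATTR = "compressionMode"; // hypothetical attribute key

  public static void main(String[] args) {
    Map<String, Map<String, String>> fieldAttributes = new HashMap<>();

    // Writer side: tag each binary doc-values field with its mode.
    fieldAttributes.put("title_bdv", Map.of(MODE_ATTR, "BEST_COMPRESSION"));
    fieldAttributes.put("id_bdv", Map.of(MODE_ATTR, "BEST_SPEED"));

    // Reader side: a per-field lookup, no raw byte to decode.
    String mode = fieldAttributes.get("title_bdv").get(MODE_ATTR);
    System.out.println(mode); // prints BEST_COMPRESSION
  }
}
```

This is also why the backward-compatibility question resolves the way Jim says: since no released format ever wrote the byte, only the attribute form ever ships.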
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559759523 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/impl/ModificationRequestImpl.java ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement.impl; + +import org.apache.solr.cluster.Replica; +import org.apache.solr.cluster.Shard; +import org.apache.solr.cluster.SolrCollection; +import org.apache.solr.cluster.placement.DeleteReplicasRequest; +import org.apache.solr.cluster.placement.DeleteShardsRequest; +import org.apache.solr.common.cloud.DocCollection; +import org.apache.solr.common.cloud.Slice; + +import java.util.HashSet; +import java.util.Set; + +/** + * Helper class to create modification request instances. + */ +public class ModificationRequestImpl { + + /** + * Create a delete replicas request. 
+ * @param collection collection to delete replicas from + * @param replicas replicas to delete + */ + public static DeleteReplicasRequest deleteReplicasRequest(SolrCollection collection, Set replicas) { +return new DeleteReplicasRequest() { + @Override + public Set getReplicas() { +return replicas; + } + + @Override + public SolrCollection getCollection() { +return collection; + } + + @Override + public String toString() { +return "DeleteReplicasRequest{collection=" + collection.getName() + +",replicas=" + replicas; + } +}; + } + + /** + * Create a delete replicas request using the internal Solr API. + * @param docCollection Solr collection + * @param shardName shard name + * @param replicaNames replica names (aka. core-node names) + * @return + */ + public static DeleteReplicasRequest deleteReplicasRequest(DocCollection docCollection, String shardName, Set replicaNames) { +SolrCollection solrCollection = SimpleClusterAbstractionsImpl.SolrCollectionImpl.fromDocCollection(docCollection); +Shard shard = solrCollection.getShard(shardName); +Slice slice = docCollection.getSlice(shardName); +Set solrReplicas = new HashSet<>(); +replicaNames.forEach(name -> { + org.apache.solr.common.cloud.Replica replica = slice.getReplica(name); + Replica solrReplica = new SimpleClusterAbstractionsImpl.ReplicaImpl(replica.getName(), shard, replica); Review comment: Why not just do `solrReplicas.add(shard.getReplica(name))` in the `forEach`? Or an easier to read IMO (personal preference but a lambda here is fine): `for (String name : replicaNames) { solrReplicas.add(shard.getReplica(name)); }` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
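The simplification this review asks for can be illustrated generically. Here a "shard" is modeled as a plain name-to-replica map standing in for the real `shard.getReplica(name)` call; both the lambda and the loop form collapse the per-element construction to a single lookup:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ReplicaLookupSketch {
  // Lambda form suggested in the review: one direct lookup per name.
  static Set<String> lookupAll(Map<String, String> shard, Set<String> replicaNames) {
    Set<String> replicas = new HashSet<>();
    replicaNames.forEach(name -> replicas.add(shard.get(name)));
    return replicas;
  }

  public static void main(String[] args) {
    Map<String, String> shard = Map.of("core_node1", "replica1", "core_node2", "replica2");

    // Equivalent plain-loop form the reviewer finds easier to read:
    Set<String> viaLoop = new HashSet<>();
    for (String name : shard.keySet()) {
      viaLoop.add(shard.get(name));
    }

    System.out.println(lookupAll(shard, shard.keySet()).equals(viaLoop)); // prints true
  }
}
```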
[jira] [Commented] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267477#comment-17267477 ] Isabelle Giguere commented on SOLR-7913: MLT Query Parser was originally implemented to allow field queries (i.e.: myField:some text) https://issues.apache.org/jira/browse/SOLR-6248 Read specifically the discussion between Steve Molloy, Vitaliy Zhovtyuk and Anshum Gupta in the first few comments. By the time the MLT QParser was committed to SVN trunk, the query format was changed: https://issues.apache.org/jira/browse/SOLR-6248?focusedCommentId=14189235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14189235 With this format, the input immediately following the closing curly brace is assumed to be a docId (whatever field is the unique id key in the schema) In CloudMLTQParser, without any of the patches on this ticket, the first thing that happens is to look for document by id, and if that fails, throw an exception. This whole "stream.body" discussion (or monologue) originally started because of a need to identify a document using a query on any field, not just an id. Maybe it's time to move away from the idea of "stream.body", and re-implement support for any field query in MLT QParser. > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fix-unit-test-setup.patch, > SOLR-7913_fixTests.patch, SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. 
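A minimal illustration of the parsing behavior described in this comment (this is not the actual `CloudMLTQParser` code): with the committed `{!mlt ...}value` syntax, whatever follows the closing curly brace is read as a uniqueKey value, so a field query in that position is still treated as a document-id lookup.

```java
public class MltValueSketch {
  // Everything after the local-params block is the positional value.
  static String positionalValue(String q) {
    return q.substring(q.indexOf('}') + 1);
  }

  public static void main(String[] args) {
    System.out.println(positionalValue("{!mlt qf=title}doc-42")); // prints doc-42
    // A field query here is not parsed as a query; it is looked up as an id,
    // which is the lookup that fails without the stream.body patches:
    System.out.println(positionalValue("{!mlt qf=title}myField:some text"));
  }
}
```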
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] donnerpeter commented on pull request #2217: LUCENE-9676: Hunspell: improve stemming of all-caps words
donnerpeter commented on pull request #2217: URL: https://github.com/apache/lucene-solr/pull/2217#issuecomment-762432969 It might be easier to review commits separately This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559761033 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/PlacementContext.java ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement; + +import org.apache.solr.cluster.Cluster; + +/** + * Review comment: Javadoc ## File path: solr/core/src/java/org/apache/solr/cluster/placement/impl/PlacementContextImpl.java ## @@ -0,0 +1,39 @@ +package org.apache.solr.cluster.placement.impl; + +import org.apache.solr.client.solrj.cloud.SolrCloudManager; +import org.apache.solr.cluster.Cluster; +import org.apache.solr.cluster.placement.AttributeFetcher; +import org.apache.solr.cluster.placement.PlacementContext; +import org.apache.solr.cluster.placement.PlacementPlanFactory; + +import java.io.IOException; + +/** + * + */ +public class PlacementContextImpl implements PlacementContext { Review comment: Maybe have "Simple" somewhere in the name of this class given it's instantiating `SimpleClusterAbstractionsImpl`? 
## File path: solr/core/src/java/org/apache/solr/cluster/placement/impl/ModificationRequestImpl.java ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.cluster.placement.impl; + +import org.apache.solr.cluster.Replica; +import org.apache.solr.cluster.Shard; +import org.apache.solr.cluster.SolrCollection; +import org.apache.solr.cluster.placement.DeleteReplicasRequest; +import org.apache.solr.cluster.placement.DeleteShardsRequest; +import org.apache.solr.common.cloud.DocCollection; +import org.apache.solr.common.cloud.Slice; + +import java.util.HashSet; +import java.util.Set; + +/** + * Helper class to create modification request instances. + */ +public class ModificationRequestImpl { + + /** + * Create a delete replicas request. 
+ * @param collection collection to delete replicas from + * @param replicas replicas to delete + */ + public static DeleteReplicasRequest deleteReplicasRequest(SolrCollection collection, Set replicas) { +return new DeleteReplicasRequest() { + @Override + public Set getReplicas() { +return replicas; + } + + @Override + public SolrCollection getCollection() { +return collection; + } + + @Override + public String toString() { +return "DeleteReplicasRequest{collection=" + collection.getName() + +",replicas=" + replicas; + } +}; + } + + /** + * Create a delete replicas request using the internal Solr API. + * @param docCollection Solr collection + * @param shardName shard name + * @param replicaNames replica names (aka. core-node names) + * @return + */ + public static DeleteReplicasRequest deleteReplicasRequest(DocCollection docCollection, String shardName, Set replicaNames) { +SolrCollection solrCollection = SimpleClusterAbstractionsImpl.SolrCollectionImpl.fromDocCollection(docCollection); +Shard shard = solrCollection.getShard(shardName); +Slice slice = docCollection.getSlice(shardName); +Set solrReplicas = new HashSet<>(); +replicaNames.forEach(name -> { + org.apache.solr.common.cloud.Replica replica = slice.getReplica(name); + Replica solrReplica = new SimpleClusterAbstractionsImpl.ReplicaImpl(replica.getName(), shard, rep
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559762633 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/impl/SimpleClusterAbstractionsImpl.java ## @@ -324,7 +324,7 @@ public int hashCode() { return new Pair<>(replicas, leader); } -private ReplicaImpl(String replicaName, Shard shard, org.apache.solr.common.cloud.Replica sliceReplica) { +ReplicaImpl(String replicaName, Shard shard, org.apache.solr.common.cloud.Replica sliceReplica) { Review comment: Why are these no longer `private`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559763883 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/plugins/AffinityPlacementConfig.java ## @@ -43,14 +46,30 @@ @JsonProperty public long prioritizedFreeDiskGB; + /** + * This property defines an additional constraint that primary collections (keys) should be + * located on the same nodes as the secondary collections (values). The plugin will assume + * that the secondary collection replicas are already in place and ignore candidate nodes where + * they are not already present. + */ + @JsonProperty + public Map withCollections; + // no-arg public constructor required for deserialization public AffinityPlacementConfig() { minimalFreeDiskGB = 20L; Review comment: I prefer the no arg constructor here to call the appropriate constructor. That way logic is not replicated (might not be applicable here unless somebody replaces `withCollections = Map.of();` with `withCollections = null;` in a future commit), it looks cleaner and by tracing calls to the most complete constructor all callers are inventoried... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
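The delegation the reviewer asks for looks like this — a simplified stand-in for `AffinityPlacementConfig` with only two of its fields, where the no-arg constructor calls the most complete constructor so the defaults live in exactly one place:

```java
import java.util.Map;

public class AffinityConfigSketch {
  final long minimalFreeDiskGB;
  final Map<String, String> withCollections;

  // No-arg constructor required for deserialization: delegate, don't duplicate.
  public AffinityConfigSketch() {
    this(20L, Map.of());
  }

  public AffinityConfigSketch(long minimalFreeDiskGB, Map<String, String> withCollections) {
    this.minimalFreeDiskGB = minimalFreeDiskGB;
    this.withCollections = withCollections;
  }

  public static void main(String[] args) {
    AffinityConfigSketch cfg = new AffinityConfigSketch();
    System.out.println(cfg.minimalFreeDiskGB + " " + cfg.withCollections.isEmpty());
    // prints: 20 true
  }
}
```

A side benefit the comment mentions: tracing callers of the complete constructor then inventories every place defaults can be overridden.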
[jira] [Updated] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabelle Giguere updated SOLR-7913: --- Attachment: SOLR-7913_negative-tests.patch > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fix-unit-test-setup.patch, > SOLR-7913_fixTests.patch, SOLR-7913_negative-tests.patch, > SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559766377 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/plugins/AffinityPlacementFactory.java ## @@ -171,14 +174,17 @@ public AffinityPlacementConfig getConfig() { private final long prioritizedFreeDiskGB; +private final Map withCollections; Review comment: Q: a given collection can only be `withCollection` for a single secondary collection? Doesn't seem necessary... Suggestion: maintain the inverse mapping as well (a multimap, but possibly this one should be a multimap as well) to save looping through map keys checking values... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
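The inverse-mapping suggestion can be sketched as follows. Collection names are made up and `inverse` is a hypothetical helper, not code from the PR; the point is that the secondary-to-primaries multimap turns "is this collection a withCollection target, and for whom?" into a single lookup instead of a scan over all keys:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class WithCollectionIndexSketch {
  // Build secondary -> {primaries} from the primary -> secondary config map.
  public static Map<String, Set<String>> inverse(Map<String, String> withCollections) {
    Map<String, Set<String>> bySecondary = new HashMap<>();
    for (Map.Entry<String, String> e : withCollections.entrySet()) {
      bySecondary.computeIfAbsent(e.getValue(), k -> new HashSet<>()).add(e.getKey());
    }
    return bySecondary;
  }

  public static void main(String[] args) {
    Map<String, String> withCollections =
        Map.of("orders", "customers", "invoices", "customers");
    Map<String, Set<String>> bySecondary = inverse(withCollections);
    System.out.println(bySecondary.get("customers")); // both primaries map here
  }
}
```

Making the forward map a multimap as well (as the comment suggests) would additionally let one primary collection declare several secondaries.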
[jira] [Comment Edited] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267477#comment-17267477 ] Isabelle Giguere edited comment on SOLR-7913 at 1/18/21, 7:42 PM: -- MLT Query Parser was originally implemented to allow field queries (i.e.: myField:some text) https://issues.apache.org/jira/browse/SOLR-6248 Read specifically the discussion between Steve Molloy, Vitaliy Zhovtyuk and Anshum Gupta in the first few comments. By the time the MLT QParser was committed to SVN trunk, the query format was changed: https://issues.apache.org/jira/browse/SOLR-6248?focusedCommentId=14189235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14189235 With this format, the input immediately following the closing curly brace is assumed to be a docId (whatever field is the unique id key in the schema) As of now, without any of the patches on this ticket, the first thing that happens is to look for document by id, and if that fails, throw an exception. This whole "stream.body" discussion (or monologue) originally started because of a need to identify a document using a query on any field, not just an id. Maybe it's time to move away from the idea of "stream.body", and re-implement support for any field query in MLT QParser. If the extra tests added in SOLR-7913_negative-tests.patch could produce results instead of an exception, I don't think anyone would need to use stream.body with an MLT QParser query. was (Author: igiguere): MLT Query Parser was originally implemented to allow field queries (i.e.: myField:some text) https://issues.apache.org/jira/browse/SOLR-6248 Read specifically the discussion between Steve Molloy, Vitaliy Zhovtyuk and Anshum Gupta in the first few comments. 
By the time the MLT QParser was committed to SVN trunk, the query format was changed: https://issues.apache.org/jira/browse/SOLR-6248?focusedCommentId=14189235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14189235 With this format, the input immediately following the closing curly brace is assumed to be a docId (whatever field is the unique id key in the schema) In CloudMLTQParser, without any of the patches on this ticket, the first thing that happens is to look for document by id, and if that fails, throw an exception. This whole "stream.body" discussion (or monologue) originally started because of a need to identify a document using a query on any field, not just an id. Maybe it's time to move away from the idea of "stream.body", and re-implement support for any field query in MLT QParser. > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fix-unit-test-setup.patch, > SOLR-7913_fixTests.patch, SOLR-7913_negative-tests.patch, > SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559767267 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/plugins/AffinityPlacementFactory.java ## @@ -238,11 +247,87 @@ public PlacementPlan computePlacement(Cluster cluster, PlacementRequest request, // failure. Current code does fail if placement is impossible (constraint is at most one replica of a shard on any node). for (Replica.ReplicaType replicaType : Replica.ReplicaType.values()) { makePlacementDecisions(solrCollection, shardName, availabilityZones, replicaType, request.getCountReplicasToCreate(replicaType), - attrValues, replicaTypeToNodes, nodesWithReplicas, coresOnNodes, placementPlanFactory, replicaPlacements); + attrValues, replicaTypeToNodes, nodesWithReplicas, coresOnNodes, placementContext.getPlacementPlanFactory(), replicaPlacements); } } - return placementPlanFactory.createPlacementPlan(request, replicaPlacements); + return placementContext.getPlacementPlanFactory().createPlacementPlan(request, replicaPlacements); +} + +@Override +public void verifyAllowedModification(ModificationRequest modificationRequest, PlacementContext placementContext) throws PlacementModificationException, InterruptedException { + if (modificationRequest instanceof DeleteShardsRequest) { +throw new UnsupportedOperationException("not implemented yet"); + } else if (!(modificationRequest instanceof DeleteReplicasRequest)) { +throw new UnsupportedOperationException("unsupported request type " + modificationRequest.getClass().getName()); + } + DeleteReplicasRequest request = (DeleteReplicasRequest) modificationRequest; + SolrCollection secondaryCollection = request.getCollection(); + if (!withCollections.values().contains(secondaryCollection.getName())) { +return; + } + Map> secondaryNodeShardReplicas = new HashMap<>(); + secondaryCollection.shards().forEach(shard -> + shard.replicas().forEach(replica -> { 
+secondaryNodeShardReplicas.computeIfAbsent(replica.getNode(), n -> new HashMap<>()) +.computeIfAbsent(replica.getShard().getShardName(), s -> new AtomicInteger()) +.incrementAndGet(); + })); + + // find the colocated-with collections + Cluster cluster = placementContext.getCluster(); + Set colocatedCollections = new HashSet<>(); + AtomicReference exc = new AtomicReference<>(); Review comment: This variable and how it's handled in the lambda below (and after the lambda) is too complex. If the `forEach` is replaced by a loop (a foreach loop...) the code is a lot simpler (and I believe shorter, although I didn't try to write it). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
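A sketch of what the suggested loop shape buys. All types here are simplified stand-ins (plain strings instead of `SolrCollection`/`Node`, and a local exception class): inside a plain for-each loop a checked exception can be thrown directly, so no `AtomicReference` is needed to carry it out of a lambda.

```java
import java.util.*;

// Simplified stand-in types; not the AffinityPlacementFactory code.
class ColocatedScan {
  static class PlacementModificationException extends Exception {
    PlacementModificationException(String msg) { super(msg); }
  }

  // With Collection.forEach, the lambda cannot throw the checked exception; it
  // must stash it in an AtomicReference and the caller rethrows after the
  // traversal. A plain loop throws directly and reads top to bottom:
  static List<String> collectColocated(Collection<String> clusterCollections,
                                       Map<String, String> withCollections,
                                       String secondary)
      throws PlacementModificationException {
    List<String> colocated = new ArrayList<>();
    for (String name : clusterCollections) {
      if (secondary.equals(withCollections.get(name))) {
        colocated.add(name);
      }
    }
    if (colocated.isEmpty()) {
      throw new PlacementModificationException("no collection is colocated with " + secondary);
    }
    return colocated;
  }
}
```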
[jira] [Comment Edited] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267477#comment-17267477 ] Isabelle Giguere edited comment on SOLR-7913 at 1/18/21, 7:43 PM: -- MLT Query Parser was originally implemented to allow field queries (i.e.: myField:some text) https://issues.apache.org/jira/browse/SOLR-6248 Read specifically the discussion between Steve Molloy, Vitaliy Zhovtyuk and Anshum Gupta in the first few comments. By the time the MLT QParser was committed to SVN trunk, the query format was changed: https://issues.apache.org/jira/browse/SOLR-6248?focusedCommentId=14189235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14189235 With this format, the input immediately following the closing curly brace is assumed to be a docId (whatever field is the unique id key in the schema) As of now, without any of the patches on this ticket, the first thing that happens is to look for document by id, and if that fails, throw an exception. This whole "stream.body" discussion (or monologue) originally started because of a need to identify a document using a query on any field, not just an id. Maybe it's time to move away from the idea of "stream.body", and re-implement support for any field query in MLT QParser. If the extra tests added in "SOLR-7913_negative-tests.patch" could produce results instead of an exception, I don't think anyone would need to use stream.body with an MLT QParser query. was (Author: igiguere): MLT Query Parser was originally implemented to allow field queries (i.e.: myField:some text) https://issues.apache.org/jira/browse/SOLR-6248 Read specifically the discussion between Steve Molloy, Vitaliy Zhovtyuk and Anshum Gupta in the first few comments. 
By the time the MLT QParser was committed to SVN trunk, the query format was changed: https://issues.apache.org/jira/browse/SOLR-6248?focusedCommentId=14189235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14189235 With this format, the input immediately following the closing curly brace is assumed to be a docId (whatever field is the unique id key in the schema) As of now, without any of the patches on this ticket, the first thing that happens is to look for document by id, and if that fails, throw an exception. This whole "stream.body" discussion (or monologue) originally started because of a need to identify a document using a query on any field, not just an id. Maybe it's time to move away from the idea of "stream.body", and re-implement support for any field query in MLT QParser. If the extra tests added in SOLR-7913_negative-tests.patch could produce results instead of an exception, I don't think anyone would need to use stream.body with an MLT QParser query. > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fix-unit-test-setup.patch, > SOLR-7913_fixTests.patch, SOLR-7913_negative-tests.patch, > SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. > It'd be good to have stream.body be supported by the mlt qparser. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-7913) Add stream.body support to MLT QParser
[ https://issues.apache.org/jira/browse/SOLR-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267462#comment-17267462 ] Isabelle Giguere edited comment on SOLR-7913 at 1/18/21, 7:43 PM: -- New patch on tag release/lucene-solr/8.5.0 I had forgotten to attach it when upgrading, last time. IMPORTANT : There was a bug in previous patches. Changes in SearchHandler and ShardRequest in the previous patches resulted in including the shard request URL in the "stream.body" passed to the MLT request. That's why test results in CloudMLTQParserTest were different, when comparing the test with the id request (testMLTQParser) and the test with stream.body (testMLTQParserStreamBody). Apply "SOLR-7913_fix-unit-test-setup.patch" on top of "SOLR-7913.patch" of today. was (Author: igiguere): New patch on tag release/lucene-solr/8.5.0 I had forgotten to attach it when upgrading, last time. IMPORTANT : There was a bug in previous patches. Changes in SearchHandler and ShardRequest in the previous patches resulted in including the shard request URL in the "stream.body" passed to the MLT request. That's why test results in CloudMLTQParserTest were different, when comparing the test with the id request (testMLTQParser) and the test with stream.body (testMLTQParserStreamBody). Apply SOLR-7913_fix-unit-test-setup.patch on top of SOLR-7913.patch > Add stream.body support to MLT QParser > -- > > Key: SOLR-7913 > URL: https://issues.apache.org/jira/browse/SOLR-7913 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-7913.patch, SOLR-7913.patch, SOLR-7913.patch, > SOLR-7913.patch, SOLR-7913_fix-unit-test-setup.patch, > SOLR-7913_fixTests.patch, SOLR-7913_negative-tests.patch, > SOLR-7913_tag_7.5.0.patch > > > Continuing from > https://issues.apache.org/jira/browse/SOLR-7639?focusedCommentId=14601011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601011. 
> It'd be good to have stream.body be supported by the mlt qparser. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559768110 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/plugins/AffinityPlacementFactory.java ## @@ -238,11 +247,87 @@ public PlacementPlan computePlacement(Cluster cluster, PlacementRequest request, // failure. Current code does fail if placement is impossible (constraint is at most one replica of a shard on any node). for (Replica.ReplicaType replicaType : Replica.ReplicaType.values()) { makePlacementDecisions(solrCollection, shardName, availabilityZones, replicaType, request.getCountReplicasToCreate(replicaType), - attrValues, replicaTypeToNodes, nodesWithReplicas, coresOnNodes, placementPlanFactory, replicaPlacements); + attrValues, replicaTypeToNodes, nodesWithReplicas, coresOnNodes, placementContext.getPlacementPlanFactory(), replicaPlacements); } } - return placementPlanFactory.createPlacementPlan(request, replicaPlacements); + return placementContext.getPlacementPlanFactory().createPlacementPlan(request, replicaPlacements); +} + +@Override +public void verifyAllowedModification(ModificationRequest modificationRequest, PlacementContext placementContext) throws PlacementModificationException, InterruptedException { Review comment: This method should be factored out and cut into a few pieces with meaningful names to make reading easier. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #2199: SOLR-15055 (Take 2) Re-implement 'withCollection'
murblanc commented on a change in pull request #2199: URL: https://github.com/apache/lucene-solr/pull/2199#discussion_r559768110 ## File path: solr/core/src/java/org/apache/solr/cluster/placement/plugins/AffinityPlacementFactory.java ## @@ -238,11 +247,87 @@ public PlacementPlan computePlacement(Cluster cluster, PlacementRequest request, // failure. Current code does fail if placement is impossible (constraint is at most one replica of a shard on any node). for (Replica.ReplicaType replicaType : Replica.ReplicaType.values()) { makePlacementDecisions(solrCollection, shardName, availabilityZones, replicaType, request.getCountReplicasToCreate(replicaType), - attrValues, replicaTypeToNodes, nodesWithReplicas, coresOnNodes, placementPlanFactory, replicaPlacements); + attrValues, replicaTypeToNodes, nodesWithReplicas, coresOnNodes, placementContext.getPlacementPlanFactory(), replicaPlacements); } } - return placementPlanFactory.createPlacementPlan(request, replicaPlacements); + return placementContext.getPlacementPlanFactory().createPlacementPlan(request, replicaPlacements); +} + +@Override +public void verifyAllowedModification(ModificationRequest modificationRequest, PlacementContext placementContext) throws PlacementModificationException, InterruptedException { Review comment: This method should IMO be factored out and cut into a few pieces with meaningful names to make reading easier. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2217: LUCENE-9676: Hunspell: improve stemming of all-caps words
dweiss commented on a change in pull request #2217: URL: https://github.com/apache/lucene-solr/pull/2217#discussion_r559806181 ## File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java ## @@ -74,6 +74,8 @@ static final char[] NOFLAGS = new char[0]; + private static final char HIDDEN_FLAG = (char) 65511; // called 'ONLYUPCASEFLAG' in Hunspell Review comment: I think you could use an explicit char here? '\uFFE7'? Not sure though because this isn't valid unicode so some validation tools may complain later on... Let's leave it. ## File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordCase.java ## @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.lucene.analysis.hunspell; + +enum WordCase { + UPPER, + TITLE, + LOWER, + MIXED; + + static WordCase caseOf(char[] word, int length) { +boolean capitalized = Character.isUpperCase(word[0]); + +boolean seenUpper = false; +boolean seenLower = false; +for (int i = 1; i < length; i++) { + char ch = word[i]; + seenUpper = seenUpper || Character.isUpperCase(ch); + seenLower = seenLower || Character.isLowerCase(ch); +} + +return get(capitalized, seenUpper, seenLower); + } + + static WordCase caseOf(CharSequence word, int length) { +boolean capitalized = Character.isUpperCase(word.charAt(0)); + +boolean seenUpper = false; +boolean seenLower = false; +for (int i = 1; i < length; i++) { Review comment: don't know if this makes much sense to optimize but you could break the loop too if (seenLower && seenUpper) as checking further on doesn't make sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
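The early-exit variant looks like this. Note the break has to wait until *both* flags are set: once both are true the result is already MIXED and cannot change, whereas breaking on either flag alone would stop the scan too early. The `get(...)` helper does not appear in the quoted diff, so the mapping below is a guessed reconstruction and may differ from the real class.

```java
// Sketch of WordCase with the suggested early exit; get(...) is reconstructed
// from context, not copied from the PR.
enum WordCase {
  UPPER, TITLE, LOWER, MIXED;

  static WordCase caseOf(CharSequence word, int length) {
    boolean capitalized = Character.isUpperCase(word.charAt(0));
    boolean seenUpper = false;
    boolean seenLower = false;
    for (int i = 1; i < length; i++) {
      char ch = word.charAt(i);
      seenUpper = seenUpper || Character.isUpperCase(ch);
      seenLower = seenLower || Character.isLowerCase(ch);
      if (seenUpper && seenLower) break; // outcome is fixed: MIXED either way
    }
    return get(capitalized, seenUpper, seenLower);
  }

  private static WordCase get(boolean capitalized, boolean seenUpper, boolean seenLower) {
    if (seenUpper && seenLower) return MIXED;        // mixed case after char 0
    if (capitalized) return seenUpper ? UPPER : TITLE;
    return seenUpper ? MIXED : LOWER;
  }
}
```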
[GitHub] [lucene-solr] dweiss merged pull request #2209: LUCENE-9671: Hunspell: shorten Stemmer.applyAffix
dweiss merged pull request #2209: URL: https://github.com/apache/lucene-solr/pull/2209 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9671) Hunspell: shorten Stemmer.applyAffix
[ https://issues.apache.org/jira/browse/LUCENE-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267560#comment-17267560 ] ASF subversion and git services commented on LUCENE-9671: - Commit ab08fdc6f0c9e5c7e27f053da59c619c6d9e643b in lucene-solr's branch refs/heads/master from Peter Gromov [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ab08fdc ] LUCENE-9671: Hunspell: shorten Stemmer.applyAffix (#2209) Call stem() recursively just once with different arguments depending on various conditions. NOTE: committing in directly as this is a refactoring, not a functional change (no CHANGES.txt entry). > Hunspell: shorten Stemmer.applyAffix > > > Key: LUCENE-9671 > URL: https://issues.apache.org/jira/browse/LUCENE-9671 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Reporter: Peter Gromov >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9671) Hunspell: shorten Stemmer.applyAffix
[ https://issues.apache.org/jira/browse/LUCENE-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-9671: Fix Version/s: master (9.0) Resolution: Fixed Status: Resolved (was: Patch Available) > Hunspell: shorten Stemmer.applyAffix > > > Key: LUCENE-9671 > URL: https://issues.apache.org/jira/browse/LUCENE-9671 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Reporter: Peter Gromov >Priority: Major > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9676) Hunspell: improve stemming of all-caps words
[ https://issues.apache.org/jira/browse/LUCENE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-9676: Fix Version/s: master (9.0) > Hunspell: improve stemming of all-caps words > > > Key: LUCENE-9676 > URL: https://issues.apache.org/jira/browse/LUCENE-9676 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Peter Gromov >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently words like "OPENOFFICE.ORG" result in no stems even if the > dictionary contains "OpenOffice.org" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-9676) Hunspell: improve stemming of all-caps words
[ https://issues.apache.org/jira/browse/LUCENE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned LUCENE-9676: --- Assignee: Dawid Weiss > Hunspell: improve stemming of all-caps words > > > Key: LUCENE-9676 > URL: https://issues.apache.org/jira/browse/LUCENE-9676 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Peter Gromov >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently words like "OPENOFFICE.ORG" result in no stems even if the > dictionary contains "OpenOffice.org" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-15052) Reducing overseer bottlenecks using per-replica states
[ https://issues.apache.org/jira/browse/SOLR-15052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267582#comment-17267582 ] Noble Paul commented on SOLR-15052: --- The DOWNNODE is still processed by overseer as a single multi op. So, i expect the performance to be somewhat similar it better. Better because the amount of data that is written is much smaller. However I shall write a simple test to confirm it > Reducing overseer bottlenecks using per-replica states > -- > > Key: SOLR-15052 > URL: https://issues.apache.org/jira/browse/SOLR-15052 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Fix For: 8.8 > > Attachments: per-replica-states-gcp.pdf > > Time Spent: 10.5h > Remaining Estimate: 0h > > This work has the same goal as SOLR-13951, that is to reduce overseer > bottlenecks by avoiding replica state updates from going to the state.json > via the overseer. However, the approach taken here is different from > SOLR-13951 and hence this work supercedes that work. > The design proposed is here: > https://docs.google.com/document/d/1xdxpzUNmTZbk0vTMZqfen9R3ArdHokLITdiISBxCFUg/edit > Briefly, > # Every replica's state will be in a separate znode nested under the > state.json. It has the name that encodes the replica name, state, leadership > status. > # An additional children watcher to be set on state.json for state changes. > # Upon a state change, a ZK multi-op to delete the previous znode and add a > new znode with new state. > Differences between this and SOLR-13951, > # In SOLR-13951, we planned to leverage shard terms for per shard states. > # As a consequence, the code changes required for SOLR-13951 were massive (we > needed a shard state provider abstraction and introduce it everywhere in the > codebase). > # This approach is a drastically simpler change and design. 
> Credits for this design and the PR is due to [~noble.paul]. > [~markrmil...@gmail.com], [~noble.paul] and I have collaborated on this > effort. The reference branch takes a conceptually similar (but not identical) > approach. > I shall attach a PR and performance benchmarks shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
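The znode-per-replica idea in the bullets above can be sketched as a tiny encode/decode of the znode *name*. The `replica:state[:L]` layout below is illustrative only, not the exact format the PR uses.

```java
// Illustrative only: each replica's state lives in its own znode nested under
// state.json, and the znode *name* carries the replica name, state, and
// leadership, so a state change is a delete+create of an empty node (a cheap
// ZK multi-op) rather than a rewrite of the whole state.json.
class ReplicaStateNode {
  final String replica;
  final String state;    // e.g. "A" = active, "D" = down in this sketch
  final boolean leader;

  ReplicaStateNode(String replica, String state, boolean leader) {
    this.replica = replica;
    this.state = state;
    this.leader = leader;
  }

  String asZnodeName() {
    return replica + ":" + state + (leader ? ":L" : "");
  }

  static ReplicaStateNode parse(String znodeName) {
    String[] parts = znodeName.split(":");
    return new ReplicaStateNode(parts[0], parts[1],
        parts.length > 2 && "L".equals(parts[2]));
  }
}
```

A children watcher on state.json then learns of state changes by diffing the child names, never reading node data.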
[jira] [Comment Edited] (SOLR-15052) Reducing overseer bottlenecks using per-replica states
[ https://issues.apache.org/jira/browse/SOLR-15052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267582#comment-17267582 ] Noble Paul edited comment on SOLR-15052 at 1/19/21, 12:35 AM: -- The DOWNNODE is still processed by overseer as a single multi op. So, I expect the performance to be somewhat similar or better. Better because the amount of data that is written is much smaller However I shall write a simple test to confirm it. was (Author: noble.paul): The DOWNNODE is still processed by overseer as a single multi op. So, i expect the performance to be somewhat similar it better. Better because the amount of data that is written is much smaller. However I shall write a simple test to confirm it > Reducing overseer bottlenecks using per-replica states > -- > > Key: SOLR-15052 > URL: https://issues.apache.org/jira/browse/SOLR-15052 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Fix For: 8.8 > > Attachments: per-replica-states-gcp.pdf > > Time Spent: 10.5h > Remaining Estimate: 0h > > This work has the same goal as SOLR-13951, that is to reduce overseer > bottlenecks by avoiding replica state updates from going to the state.json > via the overseer. However, the approach taken here is different from > SOLR-13951 and hence this work supercedes that work. > The design proposed is here: > https://docs.google.com/document/d/1xdxpzUNmTZbk0vTMZqfen9R3ArdHokLITdiISBxCFUg/edit > Briefly, > # Every replica's state will be in a separate znode nested under the > state.json. It has the name that encodes the replica name, state, leadership > status. > # An additional children watcher to be set on state.json for state changes. > # Upon a state change, a ZK multi-op to delete the previous znode and add a > new znode with new state. 
> Differences between this and SOLR-13951, > # In SOLR-13951, we planned to leverage shard terms for per shard states. > # As a consequence, the code changes required for SOLR-13951 were massive (we > needed a shard state provider abstraction and introduce it everywhere in the > codebase). > # This approach is a drastically simpler change and design. > Credits for this design and the PR is due to [~noble.paul]. > [~markrmil...@gmail.com], [~noble.paul] and I have collaborated on this > effort. The reference branch takes a conceptually similar (but not identical) > approach. > I shall attach a PR and performance benchmarks shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9663) Adding compression to terms dict from SortedSet/Sorted DocValues
[ https://issues.apache.org/jira/browse/LUCENE-9663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267595#comment-17267595 ] Jaison.Bi commented on LUCENE-9663: --- [~mikemccand] [~jpountz] [~sokolov] Please help to review the pull request, thanks :) > Adding compression to terms dict from SortedSet/Sorted DocValues > > > Key: LUCENE-9663 > URL: https://issues.apache.org/jira/browse/LUCENE-9663 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Reporter: Jaison.Bi >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > Elasticsearch keyword field uses SortedSet DocValues. In our applications, > “keyword” is the most frequently used field type. > LUCENE-7081 has done prefix-compression for docvalues terms dict. We can do > better by replacing prefix-compression with LZ4. In one of our application, > the dvd files were ~41% smaller with this change(from 1.95 GB to 1.15 GB). > I've done simple tests based on the real application data, comparing the > write/merge time cost, and the on-disk *.dvd file size(after merge into 1 > segment). > || ||Before||After|| > |Write time cost(ms)|591972|618200| > |Merge time cost(ms)|270661|294663| > |*.dvd file size(GB)|1.95|1.15| > This feature is only for the high-cardinality fields. > I'm doing the benchmark test based on luceneutil. Will attach the report and > patch after the test. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
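For context, the prefix compression (from LUCENE-7081) that this issue proposes replacing with LZ4 exploits the fact that a sorted terms dictionary shares long prefixes between neighbors: each term stores only a (shared-prefix length, suffix) pair. A toy sketch of the idea, nothing like the actual codec:

```java
import java.util.*;

// Toy model of prefix compression over a sorted terms dictionary; the real
// Lucene codec works on byte blocks, this only shows the principle.
class PrefixCompress {
  static List<String> encode(List<String> sortedTerms) {
    List<String> out = new ArrayList<>();
    String prev = "";
    for (String term : sortedTerms) {
      int shared = 0;
      int max = Math.min(prev.length(), term.length());
      while (shared < max && prev.charAt(shared) == term.charAt(shared)) {
        shared++;
      }
      out.add(shared + "|" + term.substring(shared)); // (prefix length, suffix)
      prev = term;
    }
    return out;
  }
}
```

LZ4 instead compresses blocks of terms generically, which is consistent with the issue's note that the gain shows up mainly on high-cardinality fields.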
[jira] [Commented] (LUCENE-9673) The level of IntBlockPool slice is always 1
[ https://issues.apache.org/jira/browse/LUCENE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267611#comment-17267611 ] mashudong commented on LUCENE-9673: --- Yes, it's probably an ancient bug. IMHO, ByteBlockPool does not have the same issue. > The level of IntBlockPool slice is always 1 > > > Key: LUCENE-9673 > URL: https://issues.apache.org/jira/browse/LUCENE-9673 > Project: Lucene - Core > Issue Type: Bug > Components: core/other >Reporter: mashudong >Priority: Minor > > First slice is allocated by IntBlockPool.newSlice(), and its level is 1, > > {code:java} > private int newSlice(final int size) { > if (intUpto > INT_BLOCK_SIZE-size) { > nextBuffer(); > assert assertSliceBuffer(buffer); > } > > final int upto = intUpto; > intUpto += size; > buffer[intUpto-1] = 1; > return upto; > }{code} > > > If one slice is not enough, IntBlockPool.allocSlice() is called to allocate > more slices, > as the following code shows, level is 1, newLevel is NEXT_LEVEL_ARRAY[0] > which is also 1. > > The result is the level of IntBlockPool slice is always 1, the first slice is > 2 bytes long, and all subsequent slices are 4 bytes long. 
> > {code:java} > private static final int[] NEXT_LEVEL_ARRAY = {1, 2, 3, 4, 5, 6, 7, 8, 9, 9}; > private int allocSlice(final int[] slice, final int sliceOffset) { > final int level = slice[sliceOffset]; > final int newLevel = NEXT_LEVEL_ARRAY[level - 1]; > final int newSize = LEVEL_SIZE_ARRAY[newLevel]; > // Maybe allocate another block > if (intUpto > INT_BLOCK_SIZE - newSize) { > nextBuffer(); > assert assertSliceBuffer(buffer); > } > final int newUpto = intUpto; > final int offset = newUpto + intOffset; > intUpto += newSize; > // Write forwarding address at end of last slice: > slice[sliceOffset] = offset; > // Write new level: > buffer[intUpto - 1] = newLevel; > return newUpto; > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
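The stuck-level arithmetic in the report above can be modeled in a few lines: IntBlockPool seeds the first slice with level marker 1 and indexes the table with `level - 1`, so it always reads `NEXT_LEVEL_ARRAY[0] == 1`, while ByteBlockPool (as I read its allocSlice) seeds level 0 and indexes with the level itself, which is why it does not share the bug.

```java
// Minimal model of the two pools' level bookkeeping, not the Lucene classes.
class SliceLevelModel {
  static final int[] NEXT_LEVEL_ARRAY = {1, 2, 3, 4, 5, 6, 7, 8, 9, 9};

  // IntBlockPool.allocSlice: level = slice[sliceOffset] (initially 1),
  // newLevel = NEXT_LEVEL_ARRAY[level - 1] -> always 1: the level is stuck.
  static int intPoolNext(int level) {
    return NEXT_LEVEL_ARRAY[level - 1];
  }

  // ByteBlockPool.allocSlice: level starts at 0 (marker 16|0) and
  // newLevel = NEXT_LEVEL_ARRAY[level] -> 1, 2, 3, ...: the level advances.
  static int bytePoolNext(int level) {
    return NEXT_LEVEL_ARRAY[level];
  }
}
```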
[jira] [Updated] (LUCENE-9663) Adding compression to terms dict from SortedSet/Sorted DocValues
[ https://issues.apache.org/jira/browse/LUCENE-9663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaison.Bi updated LUCENE-9663: -- Status: Patch Available (was: Open) > Adding compression to terms dict from SortedSet/Sorted DocValues > > > Key: LUCENE-9663 > URL: https://issues.apache.org/jira/browse/LUCENE-9663 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Reporter: Jaison.Bi >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > Elasticsearch keyword field uses SortedSet DocValues. In our applications, > “keyword” is the most frequently used field type. > LUCENE-7081 has done prefix-compression for docvalues terms dict. We can do > better by replacing prefix-compression with LZ4. In one of our application, > the dvd files were ~41% smaller with this change(from 1.95 GB to 1.15 GB). > I've done simple tests based on the real application data, comparing the > write/merge time cost, and the on-disk *.dvd file size(after merge into 1 > segment). > || ||Before||After|| > |Write time cost(ms)|591972|618200| > |Merge time cost(ms)|270661|294663| > |*.dvd file size(GB)|1.95|1.15| > This feature is only for the high-cardinality fields. > I'm doing the benchmark test based on luceneutil. Will attach the report and > patch after the test. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9675) Expose the compression mode of the binary doc values
[ https://issues.apache.org/jira/browse/LUCENE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267647#comment-17267647 ] Ishan Chattopadhyaya commented on LUCENE-9675: -- When do we expect this to be resolved? I'll (or Noble will) build the RC once this wraps up.

> Expose the compression mode of the binary doc values
>
> Key: LUCENE-9675
> URL: https://issues.apache.org/jira/browse/LUCENE-9675
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Jim Ferenczi
> Priority: Minor
> Attachments: LUCENE-9675.patch, LUCENE-9675.patch
>
> LUCENE-9378 introduced a way to configure the compression mode of the binary doc values.
> This issue proposes exposing that setting in the attributes of each binary field, making it visible to external readers on a per-field basis.
[jira] [Closed] (LUCENE-9671) Hunspell: shorten Stemmer.applyAffix
[ https://issues.apache.org/jira/browse/LUCENE-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Gromov closed LUCENE-9671. > Hunspell: shorten Stemmer.applyAffix > > > Key: LUCENE-9671 > URL: https://issues.apache.org/jira/browse/LUCENE-9671 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis > Reporter: Peter Gromov > Priority: Major > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h >
[jira] [Updated] (LUCENE-9671) Hunspell: shorten Stemmer.applyAffix
[ https://issues.apache.org/jira/browse/LUCENE-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Gromov updated LUCENE-9671: - Issue Type: Improvement (was: Bug) > Hunspell: shorten Stemmer.applyAffix > > > Key: LUCENE-9671 > URL: https://issues.apache.org/jira/browse/LUCENE-9671 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis > Reporter: Peter Gromov > Priority: Major > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h >
[jira] [Updated] (LUCENE-9667) Hunspell: add a spellchecker, support BREAK and FORBIDDENWORD affix rules
[ https://issues.apache.org/jira/browse/LUCENE-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Gromov updated LUCENE-9667: - Issue Type: Improvement (was: Bug) > Hunspell: add a spellchecker, support BREAK and FORBIDDENWORD affix rules > - > > Key: LUCENE-9667 > URL: https://issues.apache.org/jira/browse/LUCENE-9667 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis > Reporter: Peter Gromov > Priority: Major > Attachments: LUCENE-9667.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Test data taken from hunspell C++, the new code is based on > https://github.com/hunspell/hunspell/blob/master/src/hunspell/hunspell.cxx#L675