[jira] [Commented] (LUCENE-9563) Add .editorConfig
[ https://issues.apache.org/jira/browse/LUCENE-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208532#comment-17208532 ] Dawid Weiss commented on LUCENE-9563: - I really don't know about where to put it - I guess we can just version it in the repository? > Add .editorConfig > - > > Key: LUCENE-9563 > URL: https://issues.apache.org/jira/browse/LUCENE-9563 > Project: Lucene - Core > Issue Type: Task >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > > I propose adding a ".editorConfig" to the root of the project. Many text > editors and IDEs support this file to declare code style settings such as > indentation and more. In particular, IntelliJ supports this natively and > Eclipse has a plugin for it. > https://editorconfig.org > I furthermore propose I simply generate this as an export of my current > IntelliJ code style, which is a code style I've been using and was originally > imported from the Lucene's former IntelliJ config. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9564) Format code automatically and enforce it
Dawid Weiss created LUCENE-9564: --- Summary: Format code automatically and enforce it Key: LUCENE-9564 URL: https://issues.apache.org/jira/browse/LUCENE-9564 Project: Lucene - Core Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss This is a trivial change but a bold move. And I'm sure it's not for everyone. I started using google java format [1] in my projects a while ago and have never looked back since. It is an oracle-style formatter (doesn't allow customizations or deviations from the defined 'ideal') - this takes some getting used to - but it also eliminates *all* the potential differences between IDEs, configs, etc. And the formatted code typically looks much better than hand-edited one. It is also verifiable on precommit (so you can't commit code that deviates from what you'd get from automated formatting output). The biggest benefit I see is that refactorings become such a joy and keep the code neat, everywhere. Before you commit you just reformat everything automatically, no matter how much you messed it up. This isn't a change for everyone. I myself love hand-edited, neat code... but the reality is that with IDE support for automated code changes and so many people with different styles working on the same codebase keeping it neat is a big pain. Checkstyle and other tools are fine for ensuring certain rules but they don't take the burden of formatting off your shoulders. This tool does. Like I said - I had *great* reservations about using it at the beginning but over time got so used to it that I almost can't live without it now. It's like magic - you play with the code in any way you like, then run formatting and it's nice and neat. The downside is that automated formatting does imply potential merge problems in backward patches (or any currently existing branches). Like I said, it is a bold move. Just throwing this for your consideration. 
[GitHub] [lucene-solr] dweiss opened a new pull request #1950: LUCENE-9564: add spotless and gjf.
dweiss opened a new pull request #1950: URL: https://github.com/apache/lucene-solr/pull/1950 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-9564: Description: This is a trivial change but a bold move. And I'm sure it's not for everyone. I started using google java format [1] in my projects a while ago and have never looked back since. It is an oracle-style formatter (doesn't allow customizations or deviations from the defined 'ideal') - this takes some getting used to - but it also eliminates *all* the potential differences between IDEs, configs, etc. And the formatted code typically looks much better than hand-edited one. It is also verifiable on precommit (so you can't commit code that deviates from what you'd get from automated formatting output). The biggest benefit I see is that refactorings become such a joy and keep the code neat, everywhere. Before you commit you just reformat everything automatically, no matter how much you messed it up. This isn't a change for everyone. I myself love hand-edited, neat code... but the reality is that with IDE support for automated code changes and so many people with different styles working on the same codebase keeping it neat is a big pain. Checkstyle and other tools are fine for ensuring certain rules but they don't take the burden of formatting off your shoulders. This tool does. Like I said - I had *great* reservations about using it at the beginning but over time got so used to it that I almost can't live without it now. It's like magic - you play with the code in any way you like, then run formatting and it's nice and neat. The downside is that automated formatting does imply potential merge problems in backward patches (or any currently existing branches). Like I said, it is a bold move. Just throwing this for your consideration. I've added a PR that adds spotless but it's not ready; some files would have to be excluded as they currently violate header rules. 
A more interesting thing is here where the current code is automatically reformatted, for eyeballing only. https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041
> Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). > The biggest benefit I see is that refactor
[jira] [Commented] (LUCENE-9493) Remove obsolete dev-tools/{idea,netbeans,maven} folders
[ https://issues.apache.org/jira/browse/LUCENE-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208555#comment-17208555 ] Dawid Weiss commented on LUCENE-9493: - > It'd be interesting if the gradle build could detect that IntelliJ is > importing it, first time setup in particular, and then tell the user about > the copyright profile and any other matter. It is possible but it's not documented or officially supported... Look at intellij-idea.gradle. > Remove obsolete dev-tools/{idea,netbeans,maven} folders > --- > > Key: LUCENE-9493 > URL: https://issues.apache.org/jira/browse/LUCENE-9493 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > > I don't think they're used or applicable anymore. Thoughts? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
uschindler commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500087063

## File path: solr/packaging/build.gradle ##
@@ -62,12 +63,17 @@ dependencies {
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
But where is the documentation config defined? I did not find it anywhere else. I understand and I like it, but this is incomplete. And currently it does not work (I see no docs in ZIP files).
[GitHub] [lucene-solr] uschindler commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
uschindler commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500088372

## File path: solr/packaging/build.gradle ##
@@ -62,12 +63,17 @@ dependencies {
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
So as far as I understand: In the documentation.gradle file I would add a "configuration docs" for `:solr` and `:lucene` and then use `builtBy 'documentation'` there. That's what I am missing here. I was about to set this up yesterday, but was not sure whether I was missing something.
[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links
[ https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208556#comment-17208556 ]

Dawid Weiss commented on SOLR-14870:

> The main thing that bogged me down is that the lifecycle of gradle "tasks" really doesn't seem to make much sense, and there isn't a "clean" guide to how to refactor a "task" (instance) into a re-usable "class" that many tasks can be instances of.

It does make sense, it's just different from what you're used to. It's a bit like functional programming vs. imperative programming. You can't just "port" one into another.

> I tried to start by refactoring 'prepareSources' into a "class PrepareSources extends Sync"

Those tasks are not really meant to be subclassed. They should be composed but the composition is typically based on task dependencies rather than in-code explicit calls. I suspect in this case your "PrepareSources" task is merely configuring the Sync spec - you don't need a specific class, only a task name and configuration of the existing Sync task. Easier explained on a concrete example than in theory but this is my guess.

> is there really no better way to get the 'main' classpath from the Project then the '...getPlugin(JavaPluginConvention.class)...' hoops i jumped through?

In general there is no "main" classpath - there are source sets, configurations and a combination of these two. Java plugin creates two "convention" source sets - one for main classes and one for tests but you could have a different setup (and naming).

> if this is all expected

Yes, I think this is expected. This change reflects module names. Simpler = better?

I can take a look at the patch later, unless Uwe beats me to it.

> gradle build does not validate ref-guide -> javadoc links
> ---------------------------------------------------------
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
> Issue Type: Task
> Security Level: Public(Default Security Level.
Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-14870.patch
>
> the ant build had (has on 8x) a feature that ensured we didn't have any broken links between the ref guide and the javadocs...
> {code}
> depends="javadocs,changes-to-html,process-webpages">
> inheritall="false">
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} just did internal validation of the structure of the guide, but this hook meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first build the javadocs; then build the ref-guide; then validate _all_ links in the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage this functionality from the "solr" project doesn't seem to have been preserved -- so currently, {{gradle check}} doesn't know/care if someone adds a nonsense javadoc link to the ref-guide (or removes a class/method whose javadoc is already currently linked to from the ref guide)
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500092191

## File path: solr/packaging/build.gradle ##
@@ -62,12 +63,17 @@ dependencies {
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
Ah, sorry - yes, you are correct. It should declare artifacts, the configuration to attach them to and the outputs. An example is in solr/example/build.gradle:

    artifacts {
      packaging packagingDir, {
        builtBy assemblePackaging
      }
    }
[GitHub] [lucene-solr] uschindler commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
uschindler commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500096124

## File path: solr/packaging/build.gradle ##
@@ -62,12 +63,17 @@ dependencies {
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
Ah cool. That is what I was missing. I will add this after lunch. I was not sure how to define the artifacts. The Gradle documentation was not very helpful to me. Thanks and sorry for stupid questions. 🤣
[GitHub] [lucene-solr] sigram opened a new pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.
sigram opened a new pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951

This replaces nearly all Map / List usage in `MetricUtils` with `MapWriter` / `IteratorWriter`. The PR also includes changes to `Utils.MapWriterJSONWriter` to similarly avoid creating Map / List instances and instead serialize `MapWriter` / `IteratorWriter` directly.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500100437

## File path: solr/packaging/build.gradle ##
@@ -62,12 +63,17 @@ dependencies {
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
There are no stupid questions. This is the relevant documentation bit here, I believe.
https://docs.gradle.org/current/userguide/cross_project_publications.html
[jira] [Commented] (SOLR-14691) Metrics reporting should avoid creating objects
[ https://issues.apache.org/jira/browse/SOLR-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208572#comment-17208572 ] Andrzej Bialecki commented on SOLR-14691: - PR 1951 replaces all use of Map / List in {{MetricUtils}} and also adds similar improvements to {{MapWriterJSONWriter}} that is used whenever {{Utils.toJsonString(...)}} is used. If there are no objections I'll commit this shortly. > Metrics reporting should avoid creating objects > --- > > Key: SOLR-14691 > URL: https://issues.apache.org/jira/browse/SOLR-14691 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Critical > Fix For: 8.7 > > Time Spent: 10m > Remaining Estimate: 0h > > {{MetricUtils}} unnecessarily creates a lot of short-lived objects (maps and > lists). This affects GC, especially since metrics are frequently polled by > clients. We should refactor it to use {{MapWriter}} as much as possible. > Alternatively we could provide our wrappers or subclasses of Codahale metrics > that implement {{MapWriter}}, then a lot of complexity in {{MetricUtils}} > wouldn't be needed at all. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
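The difference between building a Map per poll and streaming entries to a writer can be sketched in plain Java. This is only an illustration under stated assumptions: `EntryWriter`, `MapWriter`, `TimerStats` and `toJson` below are simplified, hypothetical stand-ins written for this note and are not the actual Solr interfaces or the code in PR 1951. Keys are not JSON-escaped, to keep the sketch short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapWriterSketch {
    /** Simplified stand-in for an entry sink; NOT Solr's actual interface. */
    interface EntryWriter {
        EntryWriter put(String key, long value);
        EntryWriter put(String key, double value);
    }

    /** Simplified stand-in for an object that streams its own entries. */
    interface MapWriter {
        void writeMap(EntryWriter ew);
    }

    /** Hypothetical metric snapshot used only for this illustration. */
    static final class TimerStats implements MapWriter {
        private final long count;
        private final double meanMs;

        TimerStats(long count, double meanMs) {
            this.count = count;
            this.meanMs = meanMs;
        }

        // Map-building style: allocates a new map and boxed values on every poll.
        Map<String, Object> toMap() {
            Map<String, Object> m = new LinkedHashMap<>();
            m.put("count", count);
            m.put("meanMs", meanMs);
            return m;
        }

        // Streaming style: pushes primitives to the sink, no intermediate map.
        @Override
        public void writeMap(EntryWriter ew) {
            ew.put("count", count).put("meanMs", meanMs);
        }
    }

    /** Serializes a MapWriter directly into JSON text without building a Map. */
    static String toJson(MapWriter writer) {
        StringBuilder sb = new StringBuilder("{");
        writer.writeMap(new EntryWriter() {
            private boolean first = true;

            // Writes the separator (if needed) and the quoted key.
            private StringBuilder startEntry(String name) {
                if (!first) {
                    sb.append(',');
                }
                first = false;
                return sb.append('"').append(name).append("\":");
            }

            @Override
            public EntryWriter put(String key, long value) {
                startEntry(key).append(value);
                return this;
            }

            @Override
            public EntryWriter put(String key, double value) {
                startEntry(key).append(value);
                return this;
            }
        });
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        TimerStats stats = new TimerStats(3, 1.5);
        System.out.println(stats.toMap());
        System.out.println(toJson(stats));
    }
}
```

The sketch still allocates one small writer object per serialization call, so it only shows where the per-metric maps and boxed values go away, not a full accounting of allocations.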
[jira] [Commented] (LUCENE-9542) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/LUCENE-9542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208576#comment-17208576 ]

Christine Poerschke commented on LUCENE-9542:

Thanks [~krishan1390] for opening this issue with a link into the code!

{quote}Score returned in search request is original score and not reranked score post ...{quote}

Could you share more details on how to observe and reproduce the issue? I just tried the following steps on {{branch_7_7}} and {{branch_8x}} and it _seemed_ that the reranked score was being returned, perhaps a particular kind of query or re-ranking query is needed to encounter the issue?

{code}
git checkout $branch # branch_7_7 or branch_8x
cd solr ; ant dist server # on master branch use gradle equivalent instead
bin/solr start -e techproducts
curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*'
curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16\}'
curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16+reRankWeight=1\}'
curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16+reRankWeight=2\}'
curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16+reRankWeight=3\}'
bin/solr stop
{code}

> Score returned in search request is original score and not reranked score
> -------------------------------------------------------------------------
>
> Key: LUCENE-9542
> URL: https://issues.apache.org/jira/browse/LUCENE-9542
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 8.0
> Reporter: Krishan
> Priority: Major
>
> Score returned in search request is original score and not reranked score post the changes in https://issues.apache.org/jira/browse/LUCENE-8412.
> Commit - [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d]
> Specifically -
> if (cmd.getSort() != null && query instanceof RankQuery == false && (cmd.getFlags() & GET_SCORES) != 0) {
>   TopFieldCollector.populateScores(topDocs.scoreDocs, this, query);
> }
> in SolrIndexSearcher.java recomputes the score but outputs only the original score and not the reranked score.
>
> The issue is that cmd.getQuery() is a type of RankQuery but the "query" variable is a boolean query, so replacing query with cmd.getQuery() is probably the right fix, so that the score is not overridden for rerank queries.
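The reporter's reasoning about the `instanceof` check can be illustrated with a minimal Java sketch. It assumes stand-in classes only: the `Query`, `BooleanQuery` and `RankQuery` types below are hypothetical placeholders defined for this note, not the Lucene/Solr classes, and `wouldRecomputeScores` mirrors just the `instanceof` part of the quoted condition (the sort and `GET_SCORES` checks are left out). Whether this matches the actual behaviour in SolrIndexSearcher has not been confirmed in this thread.

```java
public class RerankCheckSketch {
    // Hypothetical stand-ins for the query types named in the issue.
    static class Query { }

    static class BooleanQuery extends Query { }

    static class RankQuery extends Query {
        final Query mainQuery;

        RankQuery(Query mainQuery) {
            this.mainQuery = mainQuery;
        }
    }

    /** Mirrors only the "instanceof RankQuery == false" part of the quoted condition. */
    static boolean wouldRecomputeScores(Query q) {
        return !(q instanceof RankQuery);
    }

    public static void main(String[] args) {
        Query unwrapped = new BooleanQuery();         // plays the role of the local "query" variable
        Query fromCommand = new RankQuery(unwrapped); // plays the role of cmd.getQuery()

        // Checking the unwrapped query passes the guard, so scores would be recomputed.
        System.out.println("check on query:          " + wouldRecomputeScores(unwrapped));
        // Checking the command's query fails the guard, so reranked scores would be kept.
        System.out.println("check on cmd.getQuery(): " + wouldRecomputeScores(fromCommand));
    }
}
```

Under these assumptions the guard only protects rerank queries if it is evaluated against the wrapping query, which is the change the reporter suggests.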
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208615#comment-17208615 ] Michael McCandless commented on LUCENE-9564: +1 to achieve more consistent code styling, with minimal hassle. How/when would it automatically run? How far from Lucene's code style guidelines (WARNING PDF: [Sun's coding style|http://www.oracle.com/technetwork/java/codeconventions-150003.pdf] except 2-space indent) is this one's? > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). > The biggest benefit I see is that refactorings become such a joy and keep the > code neat, everywhere. Before you commit you just reformat everything > automatically, no matter how much you messed it up. > This isn't a change for everyone. I myself love hand-edited, neat code... but > the reality is that with IDE support for automated code changes and so many > people with different styles working on the same codebase keeping it neat is > a big pain. > Checkstyle and other tools are fine for ensuring certain rules but they don't > take the burden of formatting off your shoulders. This tool does. 
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208625#comment-17208625 ] Dawid Weiss commented on LUCENE-9564: - The check itself could run in precommit. A "cleanup" task is just called 'tidy' - I wouldn't make it automatic; the precommit would fail if the code deviated from what's required. With time it just becomes a habit to call "gradlew tidy precommit"... bq. How far from Lucene's code style guidelines I think it's close to Sun's... but it also handles so many things much better - automation is the key to strict consistency here (spaces around operators, breaking conditionals into multiple lines, if necessary). It also handles new language features well (formats chained calls and closures in a very reasonable way). Take a look at the diff in that commit above - you'll see the before-after. My experience with this thing is that whenever I'm not happy with how it handled something, it is typically my fault (too complex expressions, too long variable names, broken Javadoc). > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). 
[jira] [Updated] (SOLR-14914) Add option to disable metrics collection
[ https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14914: Fix Version/s: (was: master (9.0)) > Add option to disable metrics collection > > > Key: SOLR-14914 > URL: https://issues.apache.org/jira/browse/SOLR-14914 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: 8.7 > > > Some users have expressed concerns about the overhead of metrics collection, > and consequently the need to have an option to turn off the metrics > collection altogether. > Metrics instrumentation in Solr cannot be itself easily removed or bypassed - > in order to provide fine-grained metrics many code paths had to be changed > and they now expect the metrics to be present (non-null). However, we can use > the mechanism of {{MetricSupplier}} to use no-op implementation of all > metrics, which would reduce the CPU overhead to basically the cost of an > empty method call, and the memory overhead to a HashMap entry in a > {{MetricRegistry}} (metric names still need to be tracked). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14914) Add option to disable metrics collection
Andrzej Bialecki created SOLR-14914: --- Summary: Add option to disable metrics collection Key: SOLR-14914 URL: https://issues.apache.org/jira/browse/SOLR-14914 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: metrics Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Some users have expressed concerns about the overhead of metrics collection, and consequently the need to have an option to turn off the metrics collection altogether. Metrics instrumentation in Solr cannot be itself easily removed or bypassed - in order to provide fine-grained metrics many code paths had to be changed and they now expect the metrics to be present (non-null). However, we can use the mechanism of {{MetricSupplier}} to use no-op implementation of all metrics, which would reduce the CPU overhead to basically the cost of an empty method call, and the memory overhead to a HashMap entry in a {{MetricRegistry}} (metric names still need to be tracked). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14914) Add option to disable metrics collection
[ https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14914: Fix Version/s: master (9.0) > Add option to disable metrics collection > > > Key: SOLR-14914 > URL: https://issues.apache.org/jira/browse/SOLR-14914 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0) > > > Some users have expressed concerns about the overhead of metrics collection, > and consequently the need to have an option to turn off the metrics > collection altogether. > Metrics instrumentation in Solr cannot be itself easily removed or bypassed - > in order to provide fine-grained metrics many code paths had to be changed > and they now expect the metrics to be present (non-null). However, we can use > the mechanism of {{MetricSupplier}} to use no-op implementation of all > metrics, which would reduce the CPU overhead to basically the cost of an > empty method call, and the memory overhead to a HashMap entry in a > {{MetricRegistry}} (metric names still need to be tracked). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14914) Add option to disable metrics collection
[ https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14914: Fix Version/s: 8.7 > Add option to disable metrics collection > > > Key: SOLR-14914 > URL: https://issues.apache.org/jira/browse/SOLR-14914 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0), 8.7 > > > Some users have expressed concerns about the overhead of metrics collection, > and consequently the need to have an option to turn off the metrics > collection altogether. > Metrics instrumentation in Solr cannot be itself easily removed or bypassed - > in order to provide fine-grained metrics many code paths had to be changed > and they now expect the metrics to be present (non-null). However, we can use > the mechanism of {{MetricSupplier}} to use no-op implementation of all > metrics, which would reduce the CPU overhead to basically the cost of an > empty method call, and the memory overhead to a HashMap entry in a > {{MetricRegistry}} (metric names still need to be tracked). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14914) Add option to disable metrics collection
[ https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-14914: Description: Some users have expressed concerns about the overhead of metrics collection, and consequently the need to have an option to turn off the metrics collection altogether. Metrics instrumentation in Solr cannot be itself easily removed or bypassed - in order to provide fine-grained metrics many code paths had to be changed and they now expect the metrics to be present (non-null). However, we can use the mechanism of {{MetricSupplier}} to provide singleton no-op implementations of all metrics, which would reduce the CPU overhead to basically the cost of an empty method call, and the memory overhead to a HashMap entry in a {{MetricRegistry}} (metric names still need to be tracked). was: Some users have expressed concerns about the overhead of metrics collection, and consequently the need to have an option to turn off the metrics collection altogether. Metrics instrumentation in Solr cannot be itself easily removed or bypassed - in order to provide fine-grained metrics many code paths had to be changed and they now expect the metrics to be present (non-null). However, we can use the mechanism of {{MetricSupplier}} to use no-op implementation of all metrics, which would reduce the CPU overhead to basically the cost of an empty method call, and the memory overhead to a HashMap entry in a {{MetricRegistry}} (metric names still need to be tracked). > Add option to disable metrics collection > > > Key: SOLR-14914 > URL: https://issues.apache.org/jira/browse/SOLR-14914 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) > Components: metrics >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: 8.7 > > > Some users have expressed concerns about the overhead of metrics collection, > and consequently the need to have an option to turn off the metrics > collection altogether. > Metrics instrumentation in Solr cannot be itself easily removed or bypassed - > in order to provide fine-grained metrics many code paths had to be changed > and they now expect the metrics to be present (non-null). However, we can use > the mechanism of {{MetricSupplier}} to provide singleton no-op > implementations of all metrics, which would reduce the CPU overhead to > basically the cost of an empty method call, and the memory overhead to a > HashMap entry in a {{MetricRegistry}} (metric names still need to be tracked). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
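The no-op idea in the description above can be sketched in plain Java. This is only an illustration under stated assumptions: `Counter`, `Registry`, `NoOpCounter` and the `enabled` switch are hypothetical stand-ins written for this example, not the Dropwizard `MetricRegistry` / `MetricSupplier` types or any Solr class. It shows why the cost drops to an empty method call plus one map entry per metric name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical, simplified stand-ins; these are NOT the Dropwizard or Solr classes.
class NoOpMetricsSketch {

  /** Minimal counter abstraction used only for this illustration. */
  interface Counter {
    void inc();

    long getCount();
  }

  /** A counter that actually records increments. */
  static final class RealCounter implements Counter {
    private final LongAdder count = new LongAdder();

    @Override
    public void inc() {
      count.increment();
    }

    @Override
    public long getCount() {
      return count.sum();
    }
  }

  /** Shared no-op counter: each call is an empty method and one instance serves every name. */
  static final class NoOpCounter implements Counter {
    static final NoOpCounter INSTANCE = new NoOpCounter();

    private NoOpCounter() {}

    @Override
    public void inc() {}

    @Override
    public long getCount() {
      return 0L;
    }
  }

  /** Tiny registry: names are still tracked, so each metric costs one map entry. */
  static final class Registry {
    private final Map<String, Counter> counters = new ConcurrentHashMap<>();

    Counter counter(String name, Supplier<Counter> supplier) {
      return counters.computeIfAbsent(name, n -> supplier.get());
    }

    int size() {
      return counters.size();
    }
  }

  /** Chooses the supplier once, based on a hypothetical "metrics enabled" switch. */
  static Supplier<Counter> counterSupplier(boolean enabled) {
    if (enabled) {
      return RealCounter::new;
    }
    return () -> NoOpCounter.INSTANCE;
  }

  public static void main(String[] args) {
    Registry registry = new Registry();
    Supplier<Counter> supplier = counterSupplier(false);
    Counter requests = registry.counter("requests", supplier);
    Counter errors = registry.counter("errors", supplier);
    requests.inc(); // instrumented code still calls the metric; nothing is recorded
    System.out.println(requests == errors); // both names share the singleton
    System.out.println(registry.size());    // the names are still registered
  }
}
```

Instrumented code keeps calling `inc()` unconditionally, so no call site needs a null check or an "is enabled" branch; only the supplier changes.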
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208663#comment-17208663 ] Andrzej Bialecki commented on LUCENE-9564: -- +1 to consistent and enforced formatting. I'm pretty sure we have quite a few mis-formatted files, probably more in Solr than in Lucene, and some devs are notorious for messy formatting - IMHO enforcing this wouldn't be such a bad idea, at least after some initial period of leniency (initially we could WARN developers and then switch to strict enforcement later). > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). > The biggest benefit I see is that refactorings become such a joy and keep the > code neat, everywhere. Before you commit you just reformat everything > automatically, no matter how much you messed it up. > This isn't a change for everyone. I myself love hand-edited, neat code... but > the reality is that with IDE support for automated code changes and so many > people with different styles working on the same codebase keeping it neat is > a big pain. 
> Checkstyle and other tools are fine for ensuring certain rules but they don't > take the burden of formatting off your shoulders. This tool does. > Like I said - I had *great* reservations about using it at the beginning but > over time got so used to it that I almost can't live without it now. It's > like magic - you play with the code in any way you like, then run formatting > and it's nice and neat. > The downside is that automated formatting does imply potential merge problems > in backward patches (or any currently existing branches). > Like I said, it is a bold move. Just throwing this for your consideration. > I've added a PR that adds spotless but it's not ready; some files would have > to be excluded as they currently violate header rules. > A more interesting thing is here where the current code is automatically > reformatted, for eyeballing only. > https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.
noblepaul commented on a change in pull request #1951: URL: https://github.com/apache/lucene-solr/pull/1951#discussion_r500206089 ## File path: solr/core/src/java/org/apache/solr/util/stats/MetricUtils.java ## @@ -385,10 +406,10 @@ static double nsToMs(boolean convert, double value) { } // some snapshots represent time in ns, other snapshots represent raw values (eg. chunk size) - static void addSnapshot(Map response, Snapshot snapshot, PropertyFilter propertyFilter, boolean ms) { + static void addSnapshot(MapWriter.EntryWriter ew, Snapshot snapshot, PropertyFilter propertyFilter, boolean ms) { BiConsumer filter = (k, v) -> { if (propertyFilter.accept(k)) { Review comment: I see this pattern repeated in many places. I think a helper method can be added to avoid code duplication This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
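One possible shape for such a helper, as a minimal sketch: it wraps any key/value sink so that only keys accepted by a filter reach it. The names `filtered` and `FilteredWriterSketch` are hypothetical, `java.util.function.Predicate` stands in for `PropertyFilter`, and a plain `Map` stands in for `MapWriter.EntryWriter`; none of this is taken from the pull request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

// Hypothetical helper; Predicate<String> stands in for Solr's PropertyFilter.
class FilteredWriterSketch {

  /** Wraps a sink so that only entries whose key passes the filter are forwarded to it. */
  static <V> BiConsumer<String, V> filtered(
      Predicate<String> keyFilter, BiConsumer<String, V> sink) {
    return (k, v) -> {
      if (keyFilter.test(k)) {
        sink.accept(k, v);
      }
    };
  }

  public static void main(String[] args) {
    Map<String, Object> out = new LinkedHashMap<>();
    // Keep only percentile properties; the map plays the role of the entry writer.
    BiConsumer<String, Object> writer = filtered(k -> k.startsWith("p"), out::put);
    writer.accept("mean_ms", 1.5);
    writer.accept("p95_ms", 4.2);
    writer.accept("p99_ms", 9.0);
    System.out.println(out.keySet()); // prints [p95_ms, p99_ms]
  }
}
```

Each call site would then build its consumer with one call instead of repeating the `if (filter accepts key)` lambda.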
[GitHub] [lucene-solr] noblepaul commented on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.
noblepaul commented on pull request #1951: URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704215473 Can we get rid of the interface `PropertyFilter` and replace it with a `Predicate`? You can use the class `ConditionalKeyMapWriter` to filter `MetricUtils.addMetrics()` : Why can't we just have an anonymous class that implements `MapWriter` and avoid creation of `NamedList` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208679#comment-17208679 ] Erick Erickson commented on LUCENE-9564: +1 I can adapt to most any style more easily than a dozen different ones. +1 to having it enforced automatically I'd recommend it be as close to the current recommendations as possible of course. All at once or piecemeal? My preference would be all at once, someone volunteers to just do the entire code base in one massive push. Plus, it's easier to say "ignore all diffs in commit 1234, they're all formatting" than deal with them one-by-one. Is there a way to mark something as "leave me alone"? In the diff you linked to, Lucene50SkipReader.java, about line 35 there's a comment that's been reformatted that's certainly better in the older format. Or is there a way to _not_ reformat javadocs, and if so, WDYT about that? This is not a deal-breaker, just askin'. IntelliJ's diff has a nifty "ignore all whitespace" option that would take some of the pain out of comparing a file from after the reformatting with one before the reformatting. That would help some, how much remains to be seen. What timing were you thinking? It'd be a little less difficult for people to make this coincident with releasing 9.0 as there'd be fewer back-porting issues. Again, that's not a deal-breaker. > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. 
It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). > The biggest benefit I see is that refactorings become such a joy and keep the > code neat, everywhere. Before you commit you just reformat everything > automatically, no matter how much you messed it up. > This isn't a change for everyone. I myself love hand-edited, neat code... but > the reality is that with IDE support for automated code changes and so many > people with different styles working on the same codebase keeping it neat is > a big pain. > Checkstyle and other tools are fine for ensuring certain rules but they don't > take the burden of formatting off your shoulders. This tool does. > Like I said - I had *great* reservations about using it at the beginning but > over time got so used to it that I almost can't live without it now. It's > like magic - you play with the code in any way you like, then run formatting > and it's nice and neat. > The downside is that automated formatting does imply potential merge problems > in backward patches (or any currently existing branches). > Like I said, it is a bold move. Just throwing this for your consideration. > I've added a PR that adds spotless but it's not ready; some files would have > to be excluded as they currently violate header rules. > A more interesting thing is here where the current code is automatically > reformatted, for eyeballing only. 
> https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.
noblepaul commented on a change in pull request #1951: URL: https://github.com/apache/lucene-solr/pull/1951#discussion_r500216190 ## File path: solr/core/src/test/org/apache/solr/handler/admin/MetricsHandlerTest.java ## @@ -308,19 +318,24 @@ public void testKeyMetrics() throws Exception { handler.handleRequestBody(req(CommonParams.QT, "/admin/metrics", CommonParams.WT, "json", MetricsHandler.KEY_PARAM, key1, MetricsHandler.KEY_PARAM, key2, MetricsHandler.KEY_PARAM, key3), resp); values = resp.getValues(); -val = values.findRecursive("metrics", key1); +map = new HashMap<>(); +values.toMap(map); +val = Utils.getObjectByPath(map, false, "metrics/" + key1); assertNotNull(val); -val = values.findRecursive("metrics", key2); +val = Utils.getObjectByPath(map, true, "metrics/" + key2); assertNotNull(val); -val = values.findRecursive("metrics", key3); +val = Utils.getObjectByPath(map, true, "metrics/" + key3); assertNotNull(val); String key4 = "solr.core.collection1:QUERY./select.requestTimes:1minRate"; resp = new SolrQueryResponse(); handler.handleRequestBody(req(CommonParams.QT, "/admin/metrics", CommonParams.WT, "json", MetricsHandler.KEY_PARAM, key4), resp); values = resp.getValues(); -val = values.findRecursive("metrics", key4); +map = new HashMap<>(); Review comment: There is no need to convert this to `Map`. `values._get()` can fetch nested values This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration
mayya-sharipova edited a comment on pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914 @jpountz Sorry for the noise, I have found the cause of this error, and the latest commit addresses it. Basically this PR will just address the failing test of `TestUnifiedHighlighterStrictPhrases.testBasics`. My next steps will be the following: Plan A: - use leap-frog logic without using ConjunctionDISI for this method (a separate PR for that) - Then reintroduce the checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I reverted, as sort optimization by introducing new iterators in the middle of iteration would break these checks. Plan B: - continue to use ConjunctionDISI to combine scorerIterator and collectorIterator - from checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only the check on the same doc during constructor and remove all other checks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration
mayya-sharipova edited a comment on pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914 @jpountz Sorry for the noise, I have found the cause of this error, and the latest commit addresses it. Basically this PR will just address the failing test of `TestUnifiedHighlighterStrictPhrases.testBasics`. My next steps will be the following: Plan A: - use leap-frog logic without using ConjunctionDISI for this method (a separate PR for that) - Then reintroduce the checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I reverted Plan B: - continue to use ConjunctionDISI to combine scorerIterator and collectorIterator - from checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only the check on the same doc during constructor and remove all other checks, as sort optimization by introducing new iterators in the middle of iteration would break these checks. I am interested in your opinion: which plan is better? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration
mayya-sharipova edited a comment on pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914 @jpountz Sorry for the noise, I have found the cause of this error, and the latest commit addresses it. Basically this PR will just address the failing test of `TestUnifiedHighlighterStrictPhrases.testBasics`. My next steps will be the following: Plan A: - use leap-frog logic without using ConjunctionDISI for this method (a separate PR for that) - Then reintroduce the checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I reverted Plan B: - continue to use ConjunctionDISI to combine scorerIterator and collectorIterator - from checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only the check on the same doc during constructor and remove all other checks, as sort optimization by introducing new DISI (from -1 doc) in the middle of iteration would break these checks. I am interested in your opinion: which plan is better? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration
mayya-sharipova edited a comment on pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914 @jpountz Sorry for the noise, I have found the cause of this error, and the latest commit addresses it. Basically this PR will just address the failing test of `TestUnifiedHighlighterStrictPhrases.testBasics`. My next steps will be the following: Plan A: - use leap-frog logic without using ConjunctionDISI for this method (a separate PR for that) - Then reintroduce the checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I reverted Plan B: - continue to use ConjunctionDISI to combine scorerIterator and collectorIterator - from checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only the check that the sub-iterators are on the same doc during construction and remove all other checks, as sort optimization by introducing new DISI (from -1 doc) in the middle of iteration would break these checks. I am interested in your opinion: which plan is better? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
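For readers unfamiliar with the "leap-frog logic" mentioned in Plan A, here is a minimal sketch of leap-frog intersection over two sorted doc-ID streams. It is an illustration only: `ArrayDocIterator` is a hypothetical array-backed iterator written for this example, not Lucene's `DocIdSetIterator`, and the code is not taken from the pull request. Unlike Lucene's contract, `advance` here is safe to call with a target at or below the current doc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of leap-frog intersection; not Lucene code.
class LeapFrogSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Minimal doc-ID iterator over a sorted int array, starting unpositioned at -1. */
  static final class ArrayDocIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayDocIterator(int... sortedDocs) {
      this.docs = sortedDocs;
    }

    int docID() {
      if (idx < 0) {
        return -1;
      }
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    /** Moves to the first doc >= target (staying put if already there) and returns it. */
    int advance(int target) {
      if (idx < 0) {
        idx = 0;
      }
      while (idx < docs.length && docs[idx] < target) {
        idx++;
      }
      return docID();
    }

    int nextDoc() {
      return docID() == NO_MORE_DOCS ? NO_MORE_DOCS : advance(docID() + 1);
    }
  }

  /** Each iterator in turn jumps to the other's current doc until both agree. */
  static List<Integer> intersect(ArrayDocIterator lead, ArrayDocIterator other) {
    List<Integer> hits = new ArrayList<>();
    int doc = lead.nextDoc();
    while (doc != NO_MORE_DOCS) {
      int candidate = other.advance(doc);
      if (candidate == doc) {
        hits.add(doc);               // both iterators sit on the same doc
        doc = lead.nextDoc();
      } else {
        doc = lead.advance(candidate); // leap past the gap in one step
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    ArrayDocIterator a = new ArrayDocIterator(1, 3, 5, 8, 13);
    ArrayDocIterator b = new ArrayDocIterator(2, 3, 8, 9, 13, 20);
    System.out.println(intersect(a, b)); // prints [3, 8, 13]
  }
}
```

The loop relies only on `advance` and `nextDoc`; it assumes both iterators start unpositioned and that no real doc ID equals `NO_MORE_DOCS`.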
[GitHub] [lucene-solr] noblepaul commented on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.
noblepaul commented on pull request #1951: URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704237134 Is it possible to change `MetricsMap implements Gauge>` to ` MetricsMap implements Gauge` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208702#comment-17208702 ] Dawid Weiss commented on LUCENE-9564: - > I'd recommend it be as close to the current recommendations as possible of > course. Like I said, it's not possible - this is not a configurable format, you accept it as it is. > Is there a way to mark something as "leave me alone"? No, there isn't (on purpose). Javadocs - I'm not sure but again - when you look at the diff, it seems better formatted than not (and those places that are screwed up typically required a pre tag anyway). You can exclude entire files from this check - this could be applied to generated files... although it really doesn't matter much if you do format them (they'll just look nicer) because they'll be consistently cleaned up after regeneration. > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). > The biggest benefit I see is that refactorings become such a joy and keep the > code neat, everywhere. Before you commit you just reformat everything > automatically, no matter how much you messed it up. 
> This isn't a change for everyone. I myself love hand-edited, neat code... but > the reality is that with IDE support for automated code changes and so many > people with different styles working on the same codebase keeping it neat is > a big pain. > Checkstyle and other tools are fine for ensuring certain rules but they don't > take the burden of formatting off your shoulders. This tool does. > Like I said - I had *great* reservations about using it at the beginning but > over time got so used to it that I almost can't live without it now. It's > like magic - you play with the code in any way you like, then run formatting > and it's nice and neat. > The downside is that automated formatting does imply potential merge problems > in backward patches (or any currently existing branches). > Like I said, it is a bold move. Just throwing this for your consideration. > I've added a PR that adds spotless but it's not ready; some files would have > to be excluded as they currently violate header rules. > A more interesting thing is here where the current code is automatically > reformatted, for eyeballing only. > https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208720#comment-17208720 ] Dawid Weiss commented on LUCENE-9564: - Google's style guide: https://google.github.io/styleguide/javaguide.html Formatter itself (interesting code): https://github.com/google/google-java-format
[jira] [Updated] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-9564: Description: This is a trivial change but a bold move. And I'm sure it's not for everyone. I started using google java format [1] in my projects a while ago and have never looked back since. It is an oracle-style formatter (doesn't allow customizations or deviations from the defined 'ideal') - this takes some getting used to - but it also eliminates *all* the potential differences between IDEs, configs, etc. And the formatted code typically looks much better than hand-edited one. It is also verifiable on precommit (so you can't commit code that deviates from what you'd get from automated formatting output). The biggest benefit I see is that refactorings become such a joy and keep the code neat, everywhere. Before you commit you just reformat everything automatically, no matter how much you messed it up. This isn't a change for everyone. I myself love hand-edited, neat code... but the reality is that with IDE support for automated code changes and so many people with different styles working on the same codebase keeping it neat is a big pain. Checkstyle and other tools are fine for ensuring certain rules but they don't take the burden of formatting off your shoulders. This tool does. Like I said - I had *great* reservations about using it at the beginning but over time got so used to it that I almost can't live without it now. It's like magic - you play with the code in any way you like, then run formatting and it's nice and neat. The downside is that automated formatting does imply potential merge problems in backward patches (or any currently existing branches). Like I said, it is a bold move. Just throwing this for your consideration. I've added a PR that adds spotless but it's not ready; some files would have to be excluded as they currently violate header rules. 
A more interesting thing is here where the current code is automatically reformatted, for eyeballing only. https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041 [1] https://google.github.io/styleguide/javaguide.html was: This is a trivial change but a bold move. And I'm sure it's not for everyone. I started using google java format [1] in my projects a while ago and have never looked back since. It is an oracle-style formatter (doesn't allow customizations or deviations from the defined 'ideal') - this takes some getting used to - but it also eliminates *all* the potential differences between IDEs, configs, etc. And the formatted code typically looks much better than hand-edited one. It is also verifiable on precommit (so you can't commit code that deviates from what you'd get from automated formatting output). The biggest benefit I see is that refactorings become such a joy and keep the code neat, everywhere. Before you commit you just reformat everything automatically, no matter how much you messed it up. This isn't a change for everyone. I myself love hand-edited, neat code... but the reality is that with IDE support for automated code changes and so many people with different styles working on the same codebase keeping it neat is a big pain. Checkstyle and other tools are fine for ensuring certain rules but they don't take the burden of formatting off your shoulders. This tool does. Like I said - I had *great* reservations about using it at the beginning but over time got so used to it that I almost can't live without it now. It's like magic - you play with the code in any way you like, then run formatting and it's nice and neat. The downside is that automated formatting does imply potential merge problems in backward patches (or any currently existing branches). Like I said, it is a bold move. Just throwing this for your consideration. 
I've added a PR that adds spotless but it's not ready; some files would have to be excluded as they currently violate header rules. A more interesting thing is here where the current code is automatically reformatted, for eyeballing only. https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041 > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'idea
[jira] [Updated] (LUCENE-9555) Advance conjunction Iterator for two phase iteration
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9555: Summary: Advance conjunction Iterator for two phase iteration (was: Sort optimization failure if scorerIterator is already advanced) > Advance conjunction Iterator for two phase iteration > > > Key: LUCENE-9555 > URL: https://issues.apache.org/jira/browse/LUCENE-9555 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Some collectors provide iterators that can efficiently skip non-competitive > docs. When using DefaultBulkScorer#score function we create a conjunction of > scorerIterator and collectorIterator. The problem could be if scorerIterator > has already been advanced. As collectorIterator always starts from a docID = > -1, and for creation of conjunction iterator we need all of its > sub-iterators to be on the same doc, the creation of conjunction iterator > will fail. > We need to create a conjunction between scorerIterator and collectorIterator > only if scorerIterator has not been advanced yet. > Relates to https://issues.apache.org/jira/browse/LUCENE-9280 > Relates to https://issues.apache.org/jira/browse/LUCENE-9541 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9555) Advance conjunction Iterator for two phase iteration
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9555: Description: LUCENE-9280 introduced a sort optimization where documents can be skipped. But there was a bug in case we were using two phase approximation, as we would advance it without advancing an overall conjunction iterator. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 was: Some collectors provide iterators that can efficiently skip non-competitive docs. When using DefaultBulkScorer#score function we create a conjunction of scorerIterator and collectorIterator. The problem could be if scorerIterator has already been advanced. As collectorIterator always starts from a docID = -1, and for creation of conjunction iterator we need all of its sub-iterators to be on the same doc, the creation of conjunction iterator will fail. We need to create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet. Relates to https://issues.apache.org/jira/browse/LUCENE-9280 Relates to https://issues.apache.org/jira/browse/LUCENE-9541 > Advance conjunction Iterator for two phase iteration > > > Key: LUCENE-9555 > URL: https://issues.apache.org/jira/browse/LUCENE-9555 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > LUCENE-9280 introduced a sort optimization where > documents can be skipped. > But there was a bug in case we were using two phase > approximation, as we would advance it without advancing > an overall conjunction iterator. > Relates to https://issues.apache.org/jira/browse/LUCENE-9280 > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208739#comment-17208739 ] Erick Erickson commented on LUCENE-9564: Ah, so you're saying that the bits I pointed out would be left alone if they have a tag? That fixes that problem if so.
[jira] [Commented] (LUCENE-9555) Advance conjunction Iterator for two phase iteration
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208741#comment-17208741 ] ASF subversion and git services commented on LUCENE-9555: - Commit 6ac94a6f9fba724b00bc886be2b1620d8db47d83 in lucene-solr's branch refs/heads/master from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6ac94a6 ] LUCENE-9555: Advance conjunction Iterator for two phase iteration (#1943) PR #1351 introduced a sort optimization where documents can be skipped. But there was a bug in case we were using two phase approximation, as we would advance it without advancing an overall conjunction iterator. This patch fixed it. Relates to #1351 > Advance conjunction Iterator for two phase iteration > > > Key: LUCENE-9555 > URL: https://issues.apache.org/jira/browse/LUCENE-9555 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > LUCENE-9280 introduced a sort optimization where > documents can be skipped. > But there was a bug in case we were using two phase > approximation, as we would advance it without advancing > an overall conjunction iterator. > Relates to https://issues.apache.org/jira/browse/LUCENE-9280 > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208763#comment-17208763 ] Dawid Weiss commented on LUCENE-9564: - Ideally, they should be wrapped in code blocks too -- see here: https://reflectoring.io/howto-format-code-snippets-in-javadoc/#pre--code
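As an illustration of the "wrapped in code blocks" suggestion in the comment above, here is a minimal sketch of a Javadoc comment whose snippet sits inside `<pre>{@code ...}</pre>`. The class and method names are hypothetical and are not taken from the Lucene sources. Whether the formatter leaves such a block untouched is the claim being discussed in this thread, not something this sketch demonstrates.

```java
/** Hypothetical example class; it exists only to carry the Javadoc below. */
final class JavadocSnippetExample {

  private JavadocSnippetExample() {}

  /**
   * Clamps a value into the closed range {@code [min, max]}.
   *
   * <p>Example usage:
   *
   * <pre>{@code
   * int percent = JavadocSnippetExample.clamp(150, 0, 100);
   * // percent is now 100
   * }</pre>
   *
   * @param value the value to clamp
   * @param min lower bound, inclusive
   * @param max upper bound, inclusive
   * @return {@code value} limited to the range
   */
  static int clamp(int value, int min, int max) {
    return Math.max(min, Math.min(max, value));
  }

  public static void main(String[] args) {
    System.out.println(clamp(150, 0, 100));
  }
}
```

The `{@code ...}` part stops Javadoc from interpreting characters such as `<` and `@` inside the snippet, while `<pre>` preserves its line breaks in the rendered output.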
[jira] [Commented] (LUCENE-9555) Advance conjunction Iterator for two phase iteration
[ https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208752#comment-17208752 ] ASF subversion and git services commented on LUCENE-9555: - Commit e7bf2dc8b324b65b384904959284aef9630ef761 in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e7bf2dc ] LUCENE-9555: Advance conjunction Iterator for two phase iteration (#1943) PR #1351 introduced a sort optimization where documents can be skipped. But there was a bug in case we were using two phase approximation, as we would advance it without advancing an overall conjunction iterator. This patch fixed it. Relates to #1351 > Advance conjunction Iterator for two phase iteration > > > Key: LUCENE-9555 > URL: https://issues.apache.org/jira/browse/LUCENE-9555 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > LUCENE-9280 introduced a sort optimization where > documents can be skipped. > But there was a bug in case we were using two phase > approximation, as we would advance it without advancing > an overall conjunction iterator. > Relates to https://issues.apache.org/jira/browse/LUCENE-9280 > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9565) Fix iteration over competitive iterators
Mayya Sharipova created LUCENE-9565: --- Summary: Fix iteration over competitive iterators Key: LUCENE-9565 URL: https://issues.apache.org/jira/browse/LUCENE-9565 Project: Lucene - Core Issue Type: Bug Reporter: Mayya Sharipova LUCENE-9280 introduced a sort optimization where documents can be skipped. But iteration over competitive iterators was not properly organized, as they were not storing the current docID, and when competitive iterator was updated the current doc ID was lost.
[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #1952: LUCENE-9565 Fix competitive iteration
mayya-sharipova opened a new pull request #1952: URL: https://github.com/apache/lucene-solr/pull/1952 PR #1351 introduced a sort optimization where documents can be skipped. But iteration over competitive iterators was not properly organized, as they were not storing the current docID, and when competitive iterator was updated the current doc ID was lost. This patch fixed it. Relates to #1351 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
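The problem described above (the current doc ID being lost when the competitive iterator is replaced) can be sketched as follows. This is only an illustration under stated assumptions, not the code from the pull request: `DocIterator`, `ArrayIterator` and `UpdatableIterator` are hypothetical stand-ins written for this example and are not Lucene classes. The idea shown is that the wrapper keeps its own `docID` field, so swapping the delegate for a fresh one (whose own position is -1) does not reset the position the caller sees.

```java
/** Sketch only: all types here are hypothetical stand-ins, not Lucene's API. */
final class CompetitiveIteratorSketch {

  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Minimal doc-id iterator contract assumed for this sketch. */
  interface DocIterator {
    int docID();

    int advance(int target);
  }

  /** Iterates over a sorted array of doc ids; starts unpositioned at -1. */
  static final class ArrayIterator implements DocIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayIterator(int... docs) {
      this.docs = docs;
    }

    @Override
    public int docID() {
      if (idx < 0) {
        return -1;
      }
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    @Override
    public int advance(int target) {
      if (idx < 0) {
        idx = 0;
      }
      while (idx < docs.length && docs[idx] < target) {
        idx++;
      }
      return docID();
    }
  }

  /** Wrapper whose position survives a replacement of the underlying iterator. */
  static final class UpdatableIterator implements DocIterator {
    private DocIterator delegate;
    private int docID = -1; // the position lives here, not in the delegate

    UpdatableIterator(DocIterator initial) {
      this.delegate = initial;
    }

    /** Swaps in a new (typically narrower) set of competitive docs. */
    void update(DocIterator replacement) {
      this.delegate = replacement;
    }

    @Override
    public int docID() {
      return docID;
    }

    @Override
    public int advance(int target) {
      docID = delegate.advance(target);
      return docID;
    }
  }

  public static void main(String[] args) {
    UpdatableIterator it = new UpdatableIterator(new ArrayIterator(1, 3, 5, 7, 9));
    System.out.println(it.advance(3)); // 3
    it.update(new ArrayIterator(7, 9)); // the fresh delegate is unpositioned
    System.out.println(it.docID());     // 3
    System.out.println(it.advance(4));  // 7
  }
}
```

Without the `docID` field in the wrapper, a call to `docID()` right after `update` would report the fresh delegate's -1 instead of the doc the caller was actually on.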
[GitHub] [lucene-solr] mayya-sharipova commented on pull request #1952: LUCENE-9565 Fix competitive iteration
mayya-sharipova commented on pull request #1952: URL: https://github.com/apache/lucene-solr/pull/1952#issuecomment-704311136 This patch works well with checks in [ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937), and all tests pass as well. After we merge this PR, we can reintroduce the checks in ConjunctionDISI.
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration
jpountz commented on a change in pull request #1952: URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500372658

## File path: lucene/core/src/java/org/apache/lucene/search/Weight.java ##
@@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, int min, int max) thr
       collector.setScorer(scorer);
       DocIdSetIterator scorerIterator = twoPhase == null ? iterator : twoPhase.approximation();
       DocIdSetIterator collectorIterator = collector.competitiveIterator();
-      // if possible filter scorerIterator to keep only competitive docs as defined by collector
-      DocIdSetIterator filteredIterator = collectorIterator == null ? scorerIterator :
-          ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, collectorIterator));
+      DocIdSetIterator filteredIterator = scorerIterator;
+      if (collectorIterator != null) {
+        if (scorerIterator.docID() != -1) {
+          collectorIterator.advance(scorerIterator.docID());
+        }

Review comment: I don't think that this is good enough as we might be advancing ahead of scorerIterator? This was why I thought that we should instead wrap scorerIterator in such a way that its initial docID would be -1.
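One possible reading of the suggestion above (wrapping scorerIterator so that its initial docID is -1) is sketched below. This is an assumption about what such a wrapper could look like, not the code that was eventually committed. `DocIterator`, `ArrayIterator` and `StartWrapper` are hypothetical stand-ins defined here and are not Lucene classes. The wrapper remembers where the wrapped iterator was positioned, reports -1 until it is advanced, and returns that remembered doc for any target at or before it.

```java
/** Sketch only: all types here are hypothetical stand-ins, not Lucene's API. */
final class StartWrapperSketch {

  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Minimal doc-id iterator contract assumed for this sketch. */
  interface DocIterator {
    int docID();

    int advance(int target);
  }

  /** Iterates over a sorted array of doc ids; starts unpositioned at -1. */
  static final class ArrayIterator implements DocIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayIterator(int... docs) {
      this.docs = docs;
    }

    @Override
    public int docID() {
      if (idx < 0) {
        return -1;
      }
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    @Override
    public int advance(int target) {
      if (idx < 0) {
        idx = 0;
      }
      while (idx < docs.length && docs[idx] < target) {
        idx++;
      }
      return docID();
    }
  }

  /** Presents an already-positioned iterator as if it were still at docID == -1. */
  static final class StartWrapper implements DocIterator {
    private final DocIterator in;
    private final int start; // position of the wrapped iterator at wrap time
    private int docID = -1;

    StartWrapper(DocIterator in) {
      this.in = in;
      this.start = in.docID();
    }

    @Override
    public int docID() {
      return docID;
    }

    @Override
    public int advance(int target) {
      if (target <= start) {
        // The wrapped iterator already sits on the first doc it can still return.
        docID = start;
      } else {
        docID = in.advance(target);
      }
      return docID;
    }
  }

  public static void main(String[] args) {
    ArrayIterator scorer = new ArrayIterator(2, 5, 8, 11);
    scorer.advance(5); // the scorer has already been advanced before wrapping
    StartWrapper wrapped = new StartWrapper(scorer);
    System.out.println(wrapped.docID());    // -1
    System.out.println(wrapped.advance(0)); // 5
    System.out.println(wrapped.advance(6)); // 8
  }
}
```

With such a wrapper both sub-iterators start from -1, so the collector's iterator never has to be advanced to the scorer's position up front, which is the step the review comment worries might move ahead of scorerIterator.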
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208840#comment-17208840 ] Varun Thacker commented on LUCENE-9564: --- +1 to consistent and enforced formatting. The one downside is git blame will now show the commit that reformatted the code in all these files.
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208845#comment-17208845 ] Dawid Weiss commented on LUCENE-9564: - Yes, you're correct. I don't know if there is a way around this - very likely no. It's a trivial change but it does have big side effects (and pros and cons). We don't have to do it, I just wanted to let you know about this option. I've been *really* happy with this automated-code-formatting workflow and it's been a few months now.
[jira] [Updated] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-9564: Description: This is a trivial change but a bold move. And I'm sure it's not for everyone. I started using google java format [1] in my projects a while ago and have never looked back since. It is an oracle-style formatter (doesn't allow customizations or deviations from the defined 'ideal') - this takes some getting used to - but it also eliminates *all* the potential differences between IDEs, configs, etc. And the formatted code typically looks much better than hand-edited one. It is also verifiable on precommit (so you can't commit code that deviates from what you'd get from automated formatting output). The biggest benefit I see is that refactorings become such a joy and keep the code neat, everywhere. Before you commit you just reformat everything automatically, no matter how much you messed it up. This isn't a change for everyone. I myself love hand-edited, neat code... but the reality is that with IDE support for automated code changes and so many people with different styles working on the same codebase keeping it neat is a big pain. Checkstyle and other tools are fine for ensuring certain rules but they don't take the burden of formatting off your shoulders. This tool does. Like I said - I had *great* reservations about using it at the beginning but over time got so used to it that I almost can't live without it now. It's like magic - you play with the code in any way you like, then run formatting and it's nice and neat. The downside is that automated formatting does imply potential merge problems in backward patches (or any currently existing branches). Like I said, it is a bold move. Just throwing this for your consideration. 
-I've added a PR that adds spotless but it's not ready; some files would have to be excluded as they currently violate header rules.- A more interesting thing is here where the current code is automatically reformatted - this branch is for eyeballing only. https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example [1] https://google.github.io/styleguide/javaguide.html was: This is a trivial change but a bold move. And I'm sure it's not for everyone. I started using google java format [1] in my projects a while ago and have never looked back since. It is an oracle-style formatter (doesn't allow customizations or deviations from the defined 'ideal') - this takes some getting used to - but it also eliminates *all* the potential differences between IDEs, configs, etc. And the formatted code typically looks much better than hand-edited one. It is also verifiable on precommit (so you can't commit code that deviates from what you'd get from automated formatting output). The biggest benefit I see is that refactorings become such a joy and keep the code neat, everywhere. Before you commit you just reformat everything automatically, no matter how much you messed it up. This isn't a change for everyone. I myself love hand-edited, neat code... but the reality is that with IDE support for automated code changes and so many people with different styles working on the same codebase keeping it neat is a big pain. Checkstyle and other tools are fine for ensuring certain rules but they don't take the burden of formatting off your shoulders. This tool does. Like I said - I had *great* reservations about using it at the beginning but over time got so used to it that I almost can't live without it now. It's like magic - you play with the code in any way you like, then run formatting and it's nice and neat. The downside is that automated formatting does imply potential merge problems in backward patches (or any currently existing branches). 
Like I said, it is a bold move. Just throwing this for your consideration. I've added a PR that adds spotless but it's not ready; some files would have to be excluded as they currently violate header rules. A more interesting thing is here where the current code is automatically reformatted, for eyeballing only. https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041 [1] https://google.github.io/styleguide/javaguide.html > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style form
[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links
[ https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208856#comment-17208856 ] Dawid Weiss commented on SOLR-14870: I took a peek. Is there any particular reason why you wanted to make this PrepareSources a Java task and not just use DSL, Chris? > gradle build does not validate ref-guide -> javadoc links > - > > Key: SOLR-14870 > URL: https://issues.apache.org/jira/browse/SOLR-14870 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14870.patch > > > the ant build had (has on 8x) a feature that ensured we didn't have any > broken links between the ref guide and the javadocs... > {code} > depends="javadocs,changes-to-html,process-webpages"> > inheritall="false"> > > > > > {code} > ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} > just did internal validation of the structure of the guide, but this hook > meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first > build the javadocs; then build the ref-guide; then validate _all_ links in > the ref-guide, even those to (local) javadocs > While the "local.javadocs" property logic _inside_ the > solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage > this functionality from the "solr" project doesn't seem to have been > preserved -- so currently, {{gradle check}} doesn't know/care if someone adds > a nonsense javadoc link to the ref-guide (or removes a class/method whose > javadoc is already currently linked to from the ref guide) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova merged pull request #1943: LUCENE-9555: Advance conjuction Iterator for two phase iteration
mayya-sharipova merged pull request #1943: URL: https://github.com/apache/lucene-solr/pull/1943 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14066) Deprecate DIH and migrate to a community supported package
[ https://issues.apache.org/jira/browse/SOLR-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208861#comment-17208861 ] Rui Pimentel commented on SOLR-14066: - Hi Erich Siffert, Did you try with Oracle > Deprecate DIH and migrate to a community supported package > -- > > Key: SOLR-14066 > URL: https://issues.apache.org/jira/browse/SOLR-14066 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Blocker > Fix For: 8.6 > > Attachments: image-2019-12-14-19-58-39-314.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > DIH doesn't need to remain inside Solr anymore. Plan is to deprecate DIH in > 8.6, remove from 9.0. A community supported version of DIH (which can be used > with Solr's package manager) can be found here > https://github.com/rohitbemax/dataimporthandler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration
mayya-sharipova commented on a change in pull request #1952: URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500412063 ## File path: lucene/core/src/java/org/apache/lucene/search/Weight.java ## @@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, int min, int max) thr collector.setScorer(scorer); DocIdSetIterator scorerIterator = twoPhase == null ? iterator : twoPhase.approximation(); DocIdSetIterator collectorIterator = collector.competitiveIterator(); - // if possible filter scorerIterator to keep only competitive docs as defined by collector - DocIdSetIterator filteredIterator = collectorIterator == null ? scorerIterator : - ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, collectorIterator)); + DocIdSetIterator filteredIterator = scorerIterator; + if (collectorIterator != null) { +if (scorerIterator.docID() != -1) { + collectorIterator.advance(scorerIterator.docID()); +} Review comment: @jpountz Addressed in cba6cf75f48a26cd48f09152c25518c91fd6660d This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links
[ https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208864#comment-17208864 ] Dawid Weiss commented on SOLR-14870: Perhaps I should clarify that if the intention was to pull out common configuration and logic then it'd be more natural to define multiple tasks and just run the same configuration closure over all these tasks (that handles common setup). We do use it in a few places, like in solr-forbidden-apis.gradle, for example: {code} configure(project(":solr:core")) { tasks.matching { it.name == "forbiddenApisMain" || it.name == "forbiddenApisTest" }.all { exclude "org/apache/solr/internal/**" exclude "org/apache/hadoop/**" } } {code} this applies the same configuration to two tasks (selected by name). You could just as well do something like: configure([task1, task2]) { // common setup. } and it'd work too. > gradle build does not validate ref-guide -> javadoc links > - > > Key: SOLR-14870 > URL: https://issues.apache.org/jira/browse/SOLR-14870 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14870.patch > > > the ant build had (has on 8x) a feature that ensured we didn't have any > broken links between the ref guide and the javadocs... 
> {code} > depends="javadocs,changes-to-html,process-webpages"> > inheritall="false"> > > > > > {code} > ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} > just did internal validation of the structure of the guide, but this hook > meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first > build the javadocs; then build the ref-guide; then validate _all_ links in > the ref-guide, even those to (local) javadocs > While the "local.javadocs" property logic _inside_ the > solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage > this functionality from the "solr" project doesn't seem to have been > preserved -- so currently, {{gradle check}} doesn't know/care if someone adds > a nonsense javadoc link to the ref-guide (or removes a class/method whose > javadoc is already currently linked to from the ref guide) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links
[ https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208864#comment-17208864 ] Dawid Weiss edited comment on SOLR-14870 at 10/6/20, 3:56 PM: -- Perhaps I should clarify that if the intention was to pull out common configuration and logic then it'd be more natural to define multiple tasks and just run the same configuration closure over all these tasks (that handles common setup). We do use it in a few places, like in solr-forbidden-apis.gradle, for example: {code} configure(project(":solr:core")) { tasks.matching { it.name == "forbiddenApisMain" || it.name == "forbiddenApisTest" }.all { exclude "org/apache/solr/internal/**" exclude "org/apache/hadoop/**" } } {code} this applies the same configuration to two tasks (selected by name). You could just as well do something like: {code} configure([task1, task2]) { // common setup. } {code} and it'd work too. was (Author: dweiss): Perhaps I should clarify that if the intention was to pull out common configuration and logic then it'd be more natural to define multiple tasks and just run the same configuration closure over all these tasks (that handles common setup). We do use it in a few places, like in solr-forbidden-apis.gradle, for example: {code} configure(project(":solr:core")) { tasks.matching { it.name == "forbiddenApisMain" || it.name == "forbiddenApisTest" }.all { exclude "org/apache/solr/internal/**" exclude "org/apache/hadoop/**" } } {code} this applies the same configuration to two tasks (selected by name). You could just as well do something like: configure([task1, task2]) { // common setup. } and it'd work too. > gradle build does not validate ref-guide -> javadoc links > - > > Key: SOLR-14870 > URL: https://issues.apache.org/jira/browse/SOLR-14870 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. 
Hostetter >Priority: Major > Attachments: SOLR-14870.patch > > > the ant build had (has on 8x) a feature that ensured we didn't have any > broken links between the ref guide and the javadocs... > {code} > depends="javadocs,changes-to-html,process-webpages"> > inheritall="false"> > > > > > {code} > ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} > just did internal validation of the structure of the guide, but this hook > meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first > build the javadocs; then build the ref-guide; then validate _all_ links in > the ref-guide, even those to (local) javadocs > While the "local.javadocs" property logic _inside_ the > solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage > this functionality from the "solr" project doesn't seem to have been > preserved -- so currently, {{gradle check}} doesn't know/care if someone adds > a nonsense javadoc link to the ref-guide (or removes a class/method whose > javadoc is already currently linked to from the ref guide) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14066) Deprecate DIH and migrate to a community supported package
[ https://issues.apache.org/jira/browse/SOLR-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208861#comment-17208861 ] Rui Pimentel edited comment on SOLR-14066 at 10/6/20, 4:03 PM: --- Hi Erich Siffert/Ishan Chattopadhyaya Did you try with Oracle ? was (Author: informat...@spautores.pt): Hi Erich Siffert/Ishan Chattopadhyaya Did you try with Oracle > Deprecate DIH and migrate to a community supported package > -- > > Key: SOLR-14066 > URL: https://issues.apache.org/jira/browse/SOLR-14066 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Blocker > Fix For: 8.6 > > Attachments: image-2019-12-14-19-58-39-314.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > DIH doesn't need to remain inside Solr anymore. Plan is to deprecate DIH in > 8.6, remove from 9.0. A community supported version of DIH (which can be used > with Solr's package manager) can be found here > https://github.com/rohitbemax/dataimporthandler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14066) Deprecate DIH and migrate to a community supported package
[ https://issues.apache.org/jira/browse/SOLR-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208861#comment-17208861 ] Rui Pimentel edited comment on SOLR-14066 at 10/6/20, 4:03 PM: --- Hi Erich Siffert/Ishan Chattopadhyaya Did you try with Oracle was (Author: informat...@spautores.pt): Hi Erich Siffert, Did you try with Oracle > Deprecate DIH and migrate to a community supported package > -- > > Key: SOLR-14066 > URL: https://issues.apache.org/jira/browse/SOLR-14066 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Blocker > Fix For: 8.6 > > Attachments: image-2019-12-14-19-58-39-314.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > DIH doesn't need to remain inside Solr anymore. Plan is to deprecate DIH in > 8.6, remove from 9.0. A community supported version of DIH (which can be used > with Solr's package manager) can be found here > https://github.com/rohitbemax/dataimporthandler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration
jpountz commented on a change in pull request #1952: URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500453140 ## File path: lucene/core/src/java/org/apache/lucene/search/Weight.java ## @@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, int min, int max) thr collector.setScorer(scorer); DocIdSetIterator scorerIterator = twoPhase == null ? iterator : twoPhase.approximation(); DocIdSetIterator collectorIterator = collector.competitiveIterator(); - // if possible filter scorerIterator to keep only competitive docs as defined by collector - DocIdSetIterator filteredIterator = collectorIterator == null ? scorerIterator : - ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, collectorIterator)); + DocIdSetIterator filteredIterator = scorerIterator; + if (collectorIterator != null) { +if (scorerIterator.docID() != -1) { + collectorIterator.advance(scorerIterator.docID()); +} Review comment: oh, I had not considered setting `scorerIterator.docID()` as a min docID, maybe this means that we no longer need the `min` parameter of `RangeDISIWrapper`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration
mayya-sharipova commented on a change in pull request #1952: URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500461526 ## File path: lucene/core/src/java/org/apache/lucene/search/Weight.java ## @@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, int min, int max) thr collector.setScorer(scorer); DocIdSetIterator scorerIterator = twoPhase == null ? iterator : twoPhase.approximation(); DocIdSetIterator collectorIterator = collector.competitiveIterator(); - // if possible filter scorerIterator to keep only competitive docs as defined by collector - DocIdSetIterator filteredIterator = collectorIterator == null ? scorerIterator : - ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, collectorIterator)); + DocIdSetIterator filteredIterator = scorerIterator; + if (collectorIterator != null) { +if (scorerIterator.docID() != -1) { + collectorIterator.advance(scorerIterator.docID()); +} Review comment: Thanks @jpountz , addressed in d42c4649c81364f13c51c0b147dd600e57d7 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova merged pull request #1952: LUCENE-9565 Fix competitive iteration
mayya-sharipova merged pull request #1952: URL: https://github.com/apache/lucene-solr/pull/1952 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9565) Fix iteration over competitive iterators
[ https://issues.apache.org/jira/browse/LUCENE-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208971#comment-17208971 ] ASF subversion and git services commented on LUCENE-9565: - Commit 874c446ab945aded465d12f01085af89b83563c6 in lucene-solr's branch refs/heads/master from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=874c446 ] LUCENE-9565 Fix competitive iteration (#1952) PR #1351 introduced a sort optimization where documents can be skipped. But iteration over competitive iterators was not properly organized, as they were not storing the current docID, and when competitive iterator was updated the current doc ID was lost. This patch fixed it. Relates to #1351 > Fix iteration over competitive iterators > > > Key: LUCENE-9565 > URL: https://issues.apache.org/jira/browse/LUCENE-9565 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > LUCENE-9280 introduced a sort optimization where documents can be skipped. > But iteration over competitive iterators was not properly organized, > as they were not storing the current docID, and > when competitive iterator was updated the current doc ID was lost. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
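The fix described in this commit can be pictured with a small, self-contained Java sketch. It does not use Lucene's classes: `CompetitiveIterationSketch`, `ArrayDocIterator` and `collectCompetitive` are hypothetical names invented for illustration, and the leapfrog loop is only a stand-in for what the real scorer does. The one step modelled on the patch under review is the alignment at the top of `collectCompetitive`: when the scorer's iterator is already positioned, the competitive iterator is advanced to the same doc ID before the two are intersected.

```java
import java.util.ArrayList;
import java.util.List;

final class CompetitiveIterationSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Toy forward-only iterator over a sorted array of doc IDs (hypothetical, not Lucene's API). */
  static final class ArrayDocIterator {
    private final int[] docs;
    private int index = -1; // -1 means "not positioned yet"

    ArrayDocIterator(int... docs) {
      this.docs = docs;
    }

    int docID() {
      if (index < 0) {
        return -1;
      }
      return index < docs.length ? docs[index] : NO_MORE_DOCS;
    }

    int nextDoc() {
      if (index < docs.length) {
        index++;
      }
      return docID();
    }

    /** Moves to the first doc >= target; stays in place if already there. */
    int advance(int target) {
      while (docID() < target) {
        nextDoc();
      }
      return docID();
    }
  }

  /** Collects docs present in both iterators, starting at the scorer's current position. */
  static List<Integer> collectCompetitive(ArrayDocIterator scorer, ArrayDocIterator competitive) {
    List<Integer> hits = new ArrayList<>();
    // Alignment step modelled on the patch: do not lose the scorer's current doc ID.
    if (scorer.docID() != -1) {
      competitive.advance(scorer.docID());
    }
    int doc = scorer.docID() == -1 ? scorer.nextDoc() : scorer.docID();
    while (doc != NO_MORE_DOCS) {
      int other = competitive.advance(doc);
      if (other == doc) {
        hits.add(doc);
        doc = scorer.nextDoc();
      } else {
        doc = scorer.advance(other); // leapfrog to the competitive iterator's position
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    ArrayDocIterator scorer = new ArrayDocIterator(1, 4, 7, 9);
    scorer.advance(4); // the scorer is already positioned on doc 4
    ArrayDocIterator competitive = new ArrayDocIterator(0, 2, 4, 9);
    System.out.println(collectCompetitive(scorer, competitive)); // prints [4, 9]
  }
}
```

In this toy the alignment mainly makes the invariant explicit: both iterators sit on or after the scorer's current doc before intersection starts, which is the state the commit message says was previously lost.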
[jira] [Commented] (LUCENE-9541) Ensure sub-iterators of ConjunctionDISI are on the same document
[ https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208974#comment-17208974 ] ASF subversion and git services commented on LUCENE-9541: - Commit 6b8288445f96e3cb5715e0f106ea2202dab57561 in lucene-solr's branch refs/heads/master from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6b82884 ] LUCENE-9541 ConjunctionDISI sub-iterators check (#1937) * LUCENE-9541 ConjunctionDISI sub-iterators check Ensure sub-iterators of a conjunction iterator are on the same doc. > Ensure sub-iterators of ConjunctionDISI are on the same document > > > Key: LUCENE-9541 > URL: https://issues.apache.org/jira/browse/LUCENE-9541 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Not completely sure if this is a bug. > BitSetConjunctionDISI advances based on its lead (a DocIdSetIterator) > and doesn't consider that its other component (a BitSetIterator) may have > already advanced past a certain doc. This may result in duplicate documents. > For example, suppose BitSetConjunctionDISI _disi_ is composed of DocIdSetIterator > _a_ of docs [0,1] and BitSetIterator _b_ of docs [0,1]. Calling `b.nextDoc()` > collects doc0, and then calling `disi.nextDoc()` collects the same > doc0 again. > It seems that other conjunction iterators don't have this behaviour: if we > advance any of their components past a certain document, the whole > conjunction iterator will also be advanced past this document. > > This behaviour was exposed in this > [PR|https://github.com/apache/lucene-solr/pull/1903]. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
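The scenario in the issue description (one sub-iterator moved on its own before the conjunction is used) and the kind of check the commit title refers to can be sketched in plain Java. Everything here is an assumption made for illustration: `SameDocCheckSketch`, `Cursor` and `requireSameDoc` are hypothetical names rather than Lucene's API, and rejecting the iterators with an `IllegalArgumentException` is a choice of this sketch, not something the messages above confirm.

```java
final class SameDocCheckSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Toy forward-only cursor over a sorted array of doc IDs (hypothetical, not Lucene's API). */
  static final class Cursor {
    private final int[] docs;
    private int index = -1; // -1 means "not positioned yet"

    Cursor(int... docs) {
      this.docs = docs;
    }

    int docID() {
      if (index < 0) {
        return -1;
      }
      return index < docs.length ? docs[index] : NO_MORE_DOCS;
    }

    int nextDoc() {
      if (index < docs.length) {
        index++;
      }
      return docID();
    }
  }

  /** Rejects a set of sub-iterators unless all of them report the same current doc ID. */
  static void requireSameDoc(Cursor... subs) {
    for (Cursor sub : subs) {
      if (sub.docID() != subs[0].docID()) {
        throw new IllegalArgumentException("sub-iterators are not on the same document");
      }
    }
  }

  public static void main(String[] args) {
    Cursor a = new Cursor(0, 1);
    Cursor b = new Cursor(0, 1);
    b.nextDoc(); // b moves to doc 0 on its own; a is still unpositioned (-1)
    try {
      requireSameDoc(a, b);
      System.out.println("accepted");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Checking once, when the conjunction is built, keeps the per-document iteration path untouched; callers that have already moved one sub-iterator need to bring the others to the same doc first.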
[jira] [Commented] (LUCENE-9565) Fix iteration over competitive iterators
[ https://issues.apache.org/jira/browse/LUCENE-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208979#comment-17208979 ] ASF subversion and git services commented on LUCENE-9565: - Commit 16d25ace3f4c273d45284c0e7ebf9915b245995a in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=16d25ac ] LUCENE-9565 Fix competitive iteration (#1952) PR #1351 introduced a sort optimization where documents can be skipped. But iteration over competitive iterators was not properly organized, as they were not storing the current docID, and when competitive iterator was updated the current doc ID was lost. This patch fixed it. Relates to #1351 > Fix iteration over competitive iterators > > > Key: LUCENE-9565 > URL: https://issues.apache.org/jira/browse/LUCENE-9565 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > LUCENE-9280 introduced a sort optimization where documents can be skipped. > But iteration over competitive iterators was not properly organized, > as they were not storing the current docID, and > when competitive iterator was updated the current doc ID was lost. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9541) Ensure sub-iterators of ConjunctionDISI are on the same document
[ https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208980#comment-17208980 ] ASF subversion and git services commented on LUCENE-9541: - Commit 66c49a354023a6e77b67839cc59a1e499fcd6536 in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=66c49a3 ] LUCENE-9541 ConjunctionDISI sub-iterators check (#1937) Ensure sub-iterators of a conjunction iterator are on the same doc. > Ensure sub-iterators of ConjunctionDISI are on the same document > > > Key: LUCENE-9541 > URL: https://issues.apache.org/jira/browse/LUCENE-9541 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Not completely sure if this is a bug. > BitSetConjunctionDISI advances based on its lead (a DocIdSetIterator) > and doesn't consider that its other component (a BitSetIterator) may have > already advanced past a certain doc. This may result in duplicate documents. > For example, suppose BitSetConjunctionDISI _disi_ is composed of DocIdSetIterator > _a_ of docs [0,1] and BitSetIterator _b_ of docs [0,1]. Calling `b.nextDoc()` > collects doc0, and then calling `disi.nextDoc()` collects the same > doc0 again. > It seems that other conjunction iterators don't have this behaviour: if we > advance any of their components past a certain document, the whole > conjunction iterator will also be advanced past this document. > > This behaviour was exposed in this > [PR|https://github.com/apache/lucene-solr/pull/1903]. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209000#comment-17209000 ] Robert Muir commented on LUCENE-9564: - https://github.com/psf/black#migrating-your-code-style-without-ruining-git-blame > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). > The biggest benefit I see is that refactorings become such a joy and keep the > code neat, everywhere. Before you commit you just reformat everything > automatically, no matter how much you messed it up. > This isn't a change for everyone. I myself love hand-edited, neat code... but > the reality is that with IDE support for automated code changes and so many > people with different styles working on the same codebase keeping it neat is > a big pain. > Checkstyle and other tools are fine for ensuring certain rules but they don't > take the burden of formatting off your shoulders. This tool does. > Like I said - I had *great* reservations about using it at the beginning but > over time got so used to it that I almost can't live without it now. 
It's > like magic - you play with the code in any way you like, then run formatting > and it's nice and neat. > The downside is that automated formatting does imply potential merge problems > in backward patches (or any currently existing branches). > Like I said, it is a bold move. Just throwing this for your consideration. > -I've added a PR that adds spotless but it's not ready; some files would have > to be excluded as they currently violate header rules.- > A more interesting thing is here where the current code is automatically > reformatted - this branch is for eyeballing only. > https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example > [1] https://google.github.io/styleguide/javaguide.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.
sigram commented on pull request #1951: URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704446815 > Can we get rid of the interface PropertyFilter and replace it with a Predicate? Again, we're limited by back-compat in 8x. We could add another set of methods in `MetricUtils` that use `Predicate` and deprecate `PropertyFilter` ... but that would double the number of methods. > MetricUtils.addMetrics() Not sure what you're referring to - this method doesn't create a NamedList. It's used only in `OverseerStatusCmd`, which creates a NamedList anyway for other stuff. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram edited a comment on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.
sigram edited a comment on pull request #1951: URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704446815 > Can we get rid of the interface PropertyFilter and replace it with a Predicate? Again, we're limited by back-compat in 8x. We could add another set of methods in `MetricUtils` that use `Predicate` and deprecate `PropertyFilter` ... but that would double the number of methods. > MetricUtils.addMetrics() Not sure what you're referring to - this method doesn't create a NamedList. It's used only in `OverseerStatusCmd`, which creates a NamedList anyway for other stuff. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14915) Prometheus-exporter should not depend on Solr-core
David Smiley created SOLR-14915: --- Summary: Prometheus-exporter should not depend on Solr-core Key: SOLR-14915 URL: https://issues.apache.org/jira/browse/SOLR-14915 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: contrib - prometheus-exporter Reporter: David Smiley Assignee: David Smiley I think it's *crazy* that our Prometheus exporter depends on Solr-core -- this thing is a _client_ of Solr; it does not live within Solr. The exporter ought to be fairly lean. One consequence of this dependency is that, for example, security vulnerabilities reported against Solr (e.g. Jetty) can (and do, where I work) wind up being reported against this module even though Prometheus isn't using Jetty. From my evaluation today of what's going on, it appears the crux of the problem is that the prometheus exporter uses some utility mechanisms in Solr-core like XmlConfig (which depends on SolrResourceLoader and the rabbit hole goes deeper...) and DOMUtil (further depends on PropertiesUtil). It can easily be made to not use XmlConfig. DOMUtil & PropertiesUtil could move to SolrJ which already has lots of little dependency-free utilities needed by SolrJ and Solr-core alike. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209025#comment-17209025 ] Dawid Weiss commented on LUCENE-9564: - Bull's eye! Thanks Robert. Interesting tool too - similar motivation to the one I tried to outline. > Format code automatically and enforce it > > > Key: LUCENE-9564 > URL: https://issues.apache.org/jira/browse/LUCENE-9564 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > This is a trivial change but a bold move. And I'm sure it's not for everyone. > I started using google java format [1] in my projects a while ago and have > never looked back since. It is an oracle-style formatter (doesn't allow > customizations or deviations from the defined 'ideal') - this takes some > getting used to - but it also eliminates *all* the potential differences > between IDEs, configs, etc. And the formatted code typically looks much > better than hand-edited one. It is also verifiable on precommit (so you can't > commit code that deviates from what you'd get from automated formatting > output). > The biggest benefit I see is that refactorings become such a joy and keep the > code neat, everywhere. Before you commit you just reformat everything > automatically, no matter how much you messed it up. > This isn't a change for everyone. I myself love hand-edited, neat code... but > the reality is that with IDE support for automated code changes and so many > people with different styles working on the same codebase keeping it neat is > a big pain. > Checkstyle and other tools are fine for ensuring certain rules but they don't > take the burden of formatting off your shoulders. This tool does. > Like I said - I had *great* reservations about using it at the beginning but > over time got so used to it that I almost can't live without it now. 
It's > like magic - you play with the code in any way you like, then run formatting > and it's nice and neat. > The downside is that automated formatting does imply potential merge problems > in backward patches (or any currently existing branches). > Like I said, it is a bold move. Just throwing this for your consideration. > -I've added a PR that adds spotless but it's not ready; some files would have > to be excluded as they currently violate header rules.- > A more interesting thing is here where the current code is automatically > reformatted - this branch is for eyeballing only. > https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example > [1] https://google.github.io/styleguide/javaguide.html
[jira] [Commented] (SOLR-13392) Unable to start prometheus-exporter in 7x branch
[ https://issues.apache.org/jira/browse/SOLR-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209031#comment-17209031 ] David Smiley commented on SOLR-13392: - bq. Do we have a test that tries to launch the exporter to avoid such bugs in the future? Apparently not! Docker is perfect for such a test, and we finally have Docker in our source tree but not yet Jenkins setup (whatever that may be). BTW I just filed SOLR-14915 to change the exporter to not depend on solr-core, which I think is crazy. > Unable to start prometheus-exporter in 7x branch > > > Key: SOLR-13392 > URL: https://issues.apache.org/jira/browse/SOLR-13392 > Project: Solr > Issue Type: Bug > Components: metrics >Affects Versions: 7.7.2 >Reporter: Karl Stoney >Assignee: Shalin Shekhar Mangar >Priority: Major > Fix For: 7.7.2, 8.1, master (9.0) > > Attachments: SOLR-13392.patch > > > Hi, > prometheus-exporter doesn't start in branch 7x on commit > 7dfe1c093b65f77407c2df4c2a1120a213aef166, it does work on > 26b498d0a9d25626a15e25b0cf97c8339114263a so something has changed between > those two commits causing this. > I am presuming it is > https://github.com/apache/lucene-solr/commit/e1eeafb5dc077976646b06f4cba4d77534963fa9#diff-3f7b27f0f087632739effa2aa508d77eR34 > Exception in thread "main" java.lang.NoClassDefFoundError: > org/apache/lucene/util/IOUtils > at > org.apache.solr.core.SolrResourceLoader.close(SolrResourceLoader.java:881) > at > org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:221) > at > org.apache.solr.prometheus.exporter.SolrExporter.main(SolrExporter.java:205) > Caused by: java.lang.ClassNotFoundException: org.apache.lucene.util.IOUtils > at java.net.URLClassLoader.findClass(URLClassLoader.java:382) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 
3 more
[jira] [Created] (SOLR-14916) Add split parameter to timeseries Streaming Expression
Joel Bernstein created SOLR-14916: - Summary: Add split parameter to timeseries Streaming Expression Key: SOLR-14916 URL: https://issues.apache.org/jira/browse/SOLR-14916 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Joel Bernstein Currently the time series function only supports the time aggregations across the dimension. This ticket will add the split parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-limit and split-sort parameters will also be added to control number and order of values in split field.
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code}timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" ){code} was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code}timeserie(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" ){code} > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time series reports per each > split. The split-rows and split-sort parameters will also be added to control > number and order of values in split field. 
> Sample syntax: > {code}timeseries(collection1, q="*:*", split="company", split-rows=10, > split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", > format="-dd-MM" ){code}
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code}timeserie(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" ){code} was:Currently the time series function only supports the time aggregations across the dimension. This ticket will add the split parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-limit and split-sort parameters will also be added to control number and order of values in split field. > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time series reports per each > split. The split-rows and split-sort parameters will also be added to control > number and order of values in split field. 
> Sample syntax: > {code}timeserie(collection1, q="*:*", split="company", split-rows=10, > split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", > format="-dd-MM" ){code}
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time series reports per each > split. The split-rows and split-sort parameters will also be added to control > number and order of values in split field. 
> Sample syntax: > {code} > timeseries(collection1, > q="*:*", >split="company", >split-rows=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code}
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code}timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" ){code} > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time series reports per each > split. The split-rows and split-sort parameters will also be added to control > number and order of values in split field. 
> Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-rows=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code}
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time series reports per each > split. The split-rows and split-sort parameters will also be added to control > number and order of values in split field. 
> Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-rows=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code}
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time series reports per each > split. 
The split-rows and split-sort parameters will also be added to control > number and order of values in split field. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-rows=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function.
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time series reports per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. 
This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-rows and split-sort parameters will also be added to control number and > order of values in split field. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-rows=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function.
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-rows and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-rows=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. 
This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control number > and order of values in split field. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function.
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209046#comment-17209046 ] Joel Bernstein commented on SOLR-14916: --- [~aroopganguly], how does this design look to you? > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control number > and order of values in split field. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function.
[jira] [Created] (SOLR-14917) Move DOMUtil and PropertiesUtil to SolrJ, o.a.s.common/util
David Smiley created SOLR-14917: --- Summary: Move DOMUtil and PropertiesUtil to SolrJ, o.a.s.common/util Key: SOLR-14917 URL: https://issues.apache.org/jira/browse/SOLR-14917 Project: Solr Issue Type: Sub-task Security Level: Public (Default Security Level. Issues are Public) Reporter: David Smiley Assignee: David Smiley DOMUtil has some XML DOM utilities, and PropertiesUtil has property substitution utilities. They are fairly isolated and can easily move from o.a.s.util in solr-core to the o.a.s.common.util package in SolrJ. Moving such things should happen in 9.x, but I suppose in 8.x a deprecated subclass could be added in both of the former locations?
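The "deprecated subclass in the former location" idea from the issue above could look roughly like the sketch below. It is an assumption-laden illustration: nested classes stand in for the two packages so the example fits in one file, `substitute` is a made-up utility method rather than the real PropertiesUtil API, and the approach presumes the relocated class is non-final with an accessible constructor.

```java
import java.util.Map;

/** Hypothetical sketch; nested classes simulate the old and new packages. */
final class RelocationSketch {

  /** Stands in for the utility at its new home (e.g. a SolrJ common.util package). */
  static class NewHomePropertiesUtil {
    /** Made-up helper: replaces each ${name} token with its value from props. */
    static String substitute(String value, Map<String, String> props) {
      String result = value;
      for (Map.Entry<String, String> entry : props.entrySet()) {
        result = result.replace("${" + entry.getKey() + "}", entry.getValue());
      }
      return result;
    }
  }

  /**
   * Stands in for the class left behind at the old location. Static methods
   * are inherited, so existing call sites keep compiling while the
   * deprecation warning nudges callers toward the new home.
   */
  @Deprecated
  static class OldHomePropertiesUtil extends NewHomePropertiesUtil {}
}
```

With this shape, an old call such as `RelocationSketch.OldHomePropertiesUtil.substitute(...)` resolves to the relocated implementation, so no logic is duplicated in the deprecated shim.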
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in split field. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. 
This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control number > and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. The *diff* function already supports the serial differencing of matrix columns so it's very easy to perform cluster etc.. on this output. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control number > and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* function > already supports the serial differencing of matrix columns so it's very easy > to perform cluster etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. The *diff* function already supports the serial differencing of matrix columns so it's very easy to perform clustering etc.. on this output. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. The *diff* function already supports the serial differencing of matrix columns so it's very easy to perform cluster etc.. on this output. 
> Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control number > and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* function > already supports the serial differencing of matrix columns so it's very easy > to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209051#comment-17209051 ] Aroop commented on SOLR-14916: -- [~jbernste] this looks very neat! > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control number > and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* function > already supports the serial differencing of matrix columns so it's very easy > to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. The *diff* and *minMaxScale* functions already support operations over matrix rows so it's very easy to perform clustering etc.. on this output. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. The *diff* function already supports the serial differencing of matrix columns so it's very easy to perform clustering etc.. on this output. 
> Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control number > and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* and > *minMaxScale* functions already support operations over matrix rows so it's > very easy to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9566) TestApproximationSearchEquivalence.testExclusion fails
Mayya Sharipova created LUCENE-9566: --- Summary: TestApproximationSearchEquivalence.testExclusion fails Key: LUCENE-9566 URL: https://issues.apache.org/jira/browse/LUCENE-9566 Project: Lucene - Core Issue Type: Bug Affects Versions: 8.x, master (9.0) Reporter: Mayya Sharipova Test fails since the last changes to sort optimization on doc comparator were merged. I will mute this test for now. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14916: -- Description: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control the number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. The *diff* and *minMaxScale* functions already support operations over matrix rows so it's very easy to perform clustering etc.. on this output. was: Currently the time series function only supports aggregations across the time dimension. This ticket will add the *split* parameter which will add a top level split by categorical field, to produce time lines per each split. The split-limit and split-sort parameters will also be added to control number and order of values in the split field result. Sample syntax: {code} timeseries(collection1, q="*:*", split="company", split-limit=10, split-sort="avg(price_f) desc", field="timefield", gap="+1DAY", format="-dd-MM" , avg(price_f)) {code} The output of this can be easily pivoted into a matrix and correlated or clustered like the output of the *facet2D* function. The *diff* and *minMaxScale* functions already support operations over matrix rows so it's very easy to perform clustering etc.. on this output. 
> Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control the > number and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* and > *minMaxScale* functions already support operations over matrix rows so it's > very easy to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
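The description says the split output can be pivoted into a matrix. A minimal sketch of that pivot, assuming each output tuple carries a split value, a time bucket label, and one metric value (the ticket does not show the tuple layout, so the `Row` class and the `pivot` helper are hypothetical and are not part of Solr):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class SplitPivotSketch {

  /** Hypothetical flattened output row: one metric value per (split value, time bucket). */
  static final class Row {
    final String split;
    final String bucket;
    final double value;

    Row(String split, String bucket, double value) {
      this.split = split;
      this.bucket = bucket;
      this.value = value;
    }
  }

  /**
   * Builds a matrix with one row per split value and one column per time bucket.
   * The two label lists must be passed in empty; they are filled in sorted order.
   * Cells with no matching input row stay at 0.0.
   */
  static double[][] pivot(List<Row> rows, List<String> splitLabels, List<String> bucketLabels) {
    TreeSet<String> splits = new TreeSet<>();
    TreeSet<String> buckets = new TreeSet<>();
    for (Row r : rows) {
      splits.add(r.split);
      buckets.add(r.bucket);
    }
    splitLabels.addAll(splits);
    bucketLabels.addAll(buckets);

    double[][] matrix = new double[splitLabels.size()][bucketLabels.size()];
    for (Row r : rows) {
      matrix[splitLabels.indexOf(r.split)][bucketLabels.indexOf(r.bucket)] = r.value;
    }
    return matrix;
  }

  public static void main(String[] args) {
    List<Row> rows = new ArrayList<>();
    rows.add(new Row("acme", "2020-10-01", 1.0));
    rows.add(new Row("acme", "2020-10-02", 2.0));
    rows.add(new Row("bolt", "2020-10-01", 3.0));

    List<String> splitLabels = new ArrayList<>();
    List<String> bucketLabels = new ArrayList<>();
    double[][] matrix = pivot(rows, splitLabels, bucketLabels);
    System.out.println(splitLabels + " x " + bucketLabels + " -> " + Arrays.deepToString(matrix));
  }
}
```

Each matrix row is then one time line per split value, which is the shape that row-wise operations such as differencing or scaling would expect.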
[jira] [Commented] (LUCENE-9540) Investigate double indexing of the fullPathField in the DirectoryTaxonomyWriter
[ https://issues.apache.org/jira/browse/LUCENE-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209076#comment-17209076 ] Michael McCandless commented on LUCENE-9540: Phew, thanks for bringing closure [~gworah]! > Investigate double indexing of the fullPathField in the > DirectoryTaxonomyWriter > --- > > Key: LUCENE-9540 > URL: https://issues.apache.org/jira/browse/LUCENE-9540 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Gautam Worah >Priority: Minor > > We may have reason to believe that we are double indexing the fullPathField > postings item in the DirectoryTaxonomyWriter constructor. > This should ideally be a StoredField. > See related discussion in PR https://github.com/apache/lucene-solr/pull/1733/ > Postings are already enabled for facet labels in > [FacetsConfig#L364-L399|https://github.com/apache/lucene-solr/blob/master/lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java#L364-L366] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley opened a new pull request #1953: SOLR-14917: Move DOMUtil and PropertiesUtil to SolrJ
dsmiley opened a new pull request #1953: URL: https://github.com/apache/lucene-solr/pull/1953 https://issues.apache.org/jira/browse/SOLR-14917 Not worth a CHANGES.txt entry. On 8x, the classes will stay at their old locations, marked with the Deprecated annotation and implemented as subclasses, so there's no code to maintain. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
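The 8x approach mentioned in this pull request (a deprecated subclass left at the old location) can be sketched roughly as follows. This is only an illustration: the nested classes stand in for classes in two different packages, and the `substitute` method is a hypothetical helper, not the real PropertiesUtil API.

```java
import java.util.Map;

class DeprecatedShimSketch {

  /** Stand-in for the class at its new home (for example a common.util package). */
  static class NewPropertiesUtil {
    /** Hypothetical helper: replaces each ${name} with its value, or "" if unknown. */
    static String substitute(String template, Map<String, String> props) {
      StringBuilder out = new StringBuilder();
      int pos = 0;
      while (true) {
        int start = template.indexOf("${", pos);
        int end = start < 0 ? -1 : template.indexOf('}', start);
        if (start < 0 || end < 0) {
          // No further complete placeholder: copy the remainder unchanged.
          out.append(template.substring(pos));
          return out.toString();
        }
        out.append(template, pos, start);
        out.append(props.getOrDefault(template.substring(start + 2, end), ""));
        pos = end + 1;
      }
    }
  }

  /**
   * Stand-in for the class left behind at the old location. It has no body:
   * static methods are reachable through the subclass name, so old callers
   * still compile (with a deprecation warning) and no logic is duplicated.
   */
  @Deprecated
  static class OldPropertiesUtil extends NewPropertiesUtil {}

  public static void main(String[] args) {
    // An old call site keeps working through the deprecated name.
    System.out.println(OldPropertiesUtil.substitute("dir=${home}/data", Map.of("home", "/var/solr")));
  }
}
```

The empty subclass is what keeps the maintenance cost at zero: all behavior lives in one place, and the old name can be deleted in the next major version.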
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209108#comment-17209108 ] Aroop commented on SOLR-14916: -- [~jbernste] what possible values of "gap" will we support and will the "format" have corresponding valid list of values documented or an enum/constants file to that effect created? > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control the > number and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* and > *minMaxScale* functions already support operations over matrix rows so it's > very easy to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] kwatters closed pull request #1810: SOLR-14787 - supporting an operator for the payload check query parser to support greater than / less than payload check queries.
kwatters closed pull request #1810: URL: https://github.com/apache/lucene-solr/pull/1810 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209128#comment-17209128 ] Joel Bernstein commented on SOLR-14916: --- The gap parameter maps to the JSON facet range faceting gap parameter. I believe this documented in Solr's time field docs. > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control the > number and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* and > *minMaxScale* functions already support operations over matrix rows so it's > very easy to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209128#comment-17209128 ] Joel Bernstein edited comment on SOLR-14916 at 10/6/20, 8:18 PM: - The gap parameter maps to the JSON facet range faceting gap parameter. I believe this is documented in Solr's time field docs. was (Author: joel.bernstein): The gap parameter maps to the JSON facet range faceting gap parameter. I believe this documented in Solr's time field docs. > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control the > number and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* and > *minMaxScale* functions already support operations over matrix rows so it's > very easy to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209133#comment-17209133 ] Joel Bernstein commented on SOLR-14916: --- This covers some of gap information: https://lucene.apache.org/solr/guide/8_4/working-with-dates.html#date-math-syntax > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control the > number and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* and > *minMaxScale* functions already support operations over matrix rows so it's > very easy to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209133#comment-17209133 ] Joel Bernstein edited comment on SOLR-14916 at 10/6/20, 8:21 PM: - This covers some of the gap information: https://lucene.apache.org/solr/guide/8_4/working-with-dates.html#date-math-syntax was (Author: joel.bernstein): This covers some of gap information: https://lucene.apache.org/solr/guide/8_4/working-with-dates.html#date-math-syntax > Add split parameter to timeseries Streaming Expression > -- > > Key: SOLR-14916 > URL: https://issues.apache.org/jira/browse/SOLR-14916 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > Currently the time series function only supports aggregations across the time > dimension. This ticket will add the *split* parameter which will add a top > level split by categorical field, to produce time lines per each split. The > split-limit and split-sort parameters will also be added to control the > number and order of values in the split field result. > Sample syntax: > {code} > timeseries(collection1, >q="*:*", >split="company", >split-limit=10, >split-sort="avg(price_f) desc", >field="timefield", >gap="+1DAY", >format="-dd-MM" , >avg(price_f)) > {code} > The output of this can be easily pivoted into a matrix and correlated or > clustered like the output of the *facet2D* function. The *diff* and > *minMaxScale* functions already support operations over matrix rows so it's > very easy to perform clustering etc.. on this output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
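To make the gap discussion concrete, here is a rough sketch of how a gap such as "+1DAY" turns a time range into bucket start times. It is not Solr's date math parser: the `bucketStarts` helper is hypothetical, it accepts only a small assumed subset of the syntax (a plus sign, a positive integer, and one unit), and it uses java.time for the arithmetic.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GapBucketsSketch {

  // Assumed subset of the gap syntax: "+<n><UNIT>" with an optional plural "S".
  private static final Pattern GAP =
      Pattern.compile("\\+(\\d+)(YEAR|MONTH|DAY|HOUR|MINUTE|SECOND)S?");

  /** Returns the start of every bucket in [start, end), stepping by the given gap. */
  static List<ZonedDateTime> bucketStarts(ZonedDateTime start, ZonedDateTime end, String gap) {
    Matcher m = GAP.matcher(gap);
    if (!m.matches()) {
      throw new IllegalArgumentException("Unsupported gap: " + gap);
    }
    long amount = Long.parseLong(m.group(1));
    if (amount <= 0) {
      throw new IllegalArgumentException("Gap must be positive: " + gap);
    }
    // ChronoUnit constants are plural (DAYS, HOURS, ...), hence the appended "S".
    ChronoUnit unit = ChronoUnit.valueOf(m.group(2) + "S");

    List<ZonedDateTime> starts = new ArrayList<>();
    for (ZonedDateTime t = start; t.isBefore(end); t = t.plus(amount, unit)) {
      starts.add(t);
    }
    return starts;
  }

  public static void main(String[] args) {
    ZonedDateTime start = ZonedDateTime.of(2020, 10, 1, 0, 0, 0, 0, ZoneOffset.UTC);
    ZonedDateTime end = start.plusDays(3);
    for (ZonedDateTime t : bucketStarts(start, end, "+1DAY")) {
      System.out.println(t);
    }
  }
}
```

Each bucket start would correspond to one point on a time line, and the format parameter would only affect how that point is rendered in the output.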
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration
jpountz commented on a change in pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500574341

## File path: lucene/core/src/java/org/apache/lucene/search/Weight.java
##
@@ -266,4 +274,45 @@ static void scoreAll(LeafCollector collector, DocIdSetIterator iterator, TwoPhas
     }
   }

+  /**
+   * Wraps an internal docIdSetIterator for it to start with docID = -1
+   */
+  protected static class RangeDISIWrapper extends DocIdSetIterator {
+    private final DocIdSetIterator in;
+    private final int min;
+    private final int max;
+    private int docID = -1;
+
+    public RangeDISIWrapper(DocIdSetIterator in, int max) {
+      this.in = in;
+      this.min = in.docID();
+      this.max = max;
+    }
+
+    @Override
+    public int docID() {
+      return docID;
+    }
+
+    @Override
+    public int nextDoc() throws IOException {
+      return advance(docID + 1);
+    }
+
+    @Override
+    public int advance(int target) throws IOException {
+      target = Math.max(min, target);
+      if (target >= max) {
+        return docID = NO_MORE_DOCS;
+      }
+      return docID = in.advance(target);

Review comment:
it just occurred to me that this implementation is not correct in the case that the minimum bound of the range of doc IDs to score is less than the current doc ID of the scorer, have you seen any failures with your change? I wonder that we would need to do
```
if (target >= scorer.docID()) {
  return scorer.docID();
}
```
but we should create a test that fails without this

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
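One way to see the kind of problem raised in this review is a small stand-alone model. The sketch below does not use Lucene classes: `ArrayIterator` and `RangeWrapper` are hypothetical stand-ins, the guard uses `<=` (an assumption of this sketch, not the exact condition quoted in the comment), and the clamping of results to `max` is also an addition that the diff does not have. It relies on the documented DocIdSetIterator contract that advance(target) is undefined when target is not beyond the current doc; here the stand-in iterator simply moves forward, which skips the current doc.

```java
import java.util.ArrayList;
import java.util.List;

class RangeIteratorSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical stand-in for a doc ID iterator over a sorted array. */
  static class ArrayIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayIterator(int... docs) {
      this.docs = docs;
    }

    int docID() {
      if (idx < 0) return -1;
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    /** Always moves at least one position forward, so target <= docID() skips the current doc. */
    int advance(int target) {
      do {
        idx++;
      } while (idx < docs.length && docs[idx] < target);
      return docID();
    }
  }

  /** Exposes the docs of the wrapped iterator in [in.docID(), max), starting from -1. */
  static class RangeWrapper {
    private final ArrayIterator in;
    private final int min;
    private final int max;
    private final boolean guard;
    private int docID = -1;

    RangeWrapper(ArrayIterator in, int max, boolean guard) {
      this.in = in;
      this.min = in.docID();
      this.max = max;
      this.guard = guard;
    }

    int nextDoc() {
      return advance(docID + 1);
    }

    int advance(int target) {
      target = Math.max(min, target);
      if (target >= max) {
        return docID = NO_MORE_DOCS;
      }
      if (guard && target <= in.docID()) {
        // The wrapped iterator already sits at or past the target: reuse its position.
        int current = in.docID();
        return docID = current < max ? current : NO_MORE_DOCS;
      }
      int doc = in.advance(target);
      // Clamp to the range (an addition in this sketch).
      return docID = doc < max ? doc : NO_MORE_DOCS;
    }
  }

  /** Positions an iterator over {1, 3, 5, 8} on doc 3, then collects docs below 8 via the wrapper. */
  static List<Integer> collect(boolean guard) {
    ArrayIterator it = new ArrayIterator(1, 3, 5, 8);
    it.advance(3);
    RangeWrapper wrapper = new RangeWrapper(it, 8, guard);
    List<Integer> seen = new ArrayList<>();
    for (int doc = wrapper.nextDoc(); doc != NO_MORE_DOCS; doc = wrapper.nextDoc()) {
      seen.add(doc);
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println("without guard: " + collect(false));
    System.out.println("with guard:    " + collect(true));
  }
}
```

In this model the unguarded wrapper never reports doc 3, because the first advance call targets the doc the wrapped iterator is already on. A test along these lines, written against the real classes, is the sort of failing case the review asks for.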
[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it
[ https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209146#comment-17209146 ]

Adrien Grand commented on LUCENE-9564:

There are downsides to code formatting (I'll miss 2-space indentation :)) but the pros outweigh the cons, +1. I'm looking forward to imports no longer being reordered when someone using IntelliJ modifies a file after someone using Eclipse, and to not having to look at code style at all when reviewing pull requests. I like how file format javadocs look with this formatter.

> Format code automatically and enforce it
>
> Key: LUCENE-9564
> URL: https://issues.apache.org/jira/browse/LUCENE-9564
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Priority: Trivial
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This is a trivial change but a bold move. And I'm sure it's not for everyone.
> I started using google java format [1] in my projects a while ago and have
> never looked back since. It is an oracle-style formatter (doesn't allow
> customizations or deviations from the defined 'ideal') - this takes some
> getting used to - but it also eliminates *all* the potential differences
> between IDEs, configs, etc. And the formatted code typically looks much
> better than hand-edited one. It is also verifiable on precommit (so you can't
> commit code that deviates from what you'd get from automated formatting
> output).
> The biggest benefit I see is that refactorings become such a joy and keep the
> code neat, everywhere. Before you commit you just reformat everything
> automatically, no matter how much you messed it up.
> This isn't a change for everyone. I myself love hand-edited, neat code... but
> the reality is that with IDE support for automated code changes and so many
> people with different styles working on the same codebase keeping it neat is
> a big pain.
> Checkstyle and other tools are fine for ensuring certain rules but they don't
> take the burden of formatting off your shoulders. This tool does.
> Like I said - I had *great* reservations about using it at the beginning but
> over time got so used to it that I almost can't live without it now. It's
> like magic - you play with the code in any way you like, then run formatting
> and it's nice and neat.
> The downside is that automated formatting does imply potential merge problems
> in backward patches (or any currently existing branches).
> Like I said, it is a bold move. Just throwing this for your consideration.
> -I've added a PR that adds spotless but it's not ready; some files would have
> to be excluded as they currently violate header rules.-
> A more interesting thing is here where the current code is automatically
> reformatted - this branch is for eyeballing only.
> https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example
> [1] https://google.github.io/styleguide/javaguide.html
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209147#comment-17209147 ]

Aroop commented on SOLR-14916:

Thanks Joel

> Add split parameter to timeseries Streaming Expression
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Joel Bernstein
> Priority: Major
>
> Currently the time series function only supports aggregations across the time
> dimension. This ticket will add the *split* parameter which will add a top
> level split by categorical field, to produce time lines per each split. The
> split-limit and split-sort parameters will also be added to control the
> number and order of values in the split field result.
> Sample syntax:
> {code}
> timeseries(collection1,
>            q="*:*",
>            split="company",
>            split-limit=10,
>            split-sort="avg(price_f) desc",
>            field="timefield",
>            gap="+1DAY",
>            format="-dd-MM",
>            avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or
> clustered like the output of the *facet2D* function. The *diff* and
> *minMaxScale* functions already support operations over matrix rows so it's
> very easy to perform clustering etc. on this output.
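
For readers wondering what "pivoted into a matrix" could look like on the client side, here is a minimal Python sketch. It is not taken from the ticket: the tuple shape (one flat record per time bucket and split value, keyed by the field names used in the sample expression) is an assumption, the sample data is made up, and `pivot_timeseries` is a hypothetical helper name.

```python
def pivot_timeseries(tuples, time_field, split_field, value_field, fill=0.0):
    """Pivot flat (time bucket, split value, metric) records into a matrix.

    Returns (rows, columns, matrix): one row per split value and one column
    per time bucket, both sorted; missing cells are set to `fill`.
    """
    cells = {}
    buckets = set()
    splits = set()
    for record in tuples:
        bucket = record[time_field]
        split = record[split_field]
        buckets.add(bucket)
        splits.add(split)
        cells[(split, bucket)] = record[value_field]
    columns = sorted(buckets)
    rows = sorted(splits)
    matrix = [[cells.get((r, c), fill) for c in columns] for r in rows]
    return rows, columns, matrix


# Hypothetical records: the ticket does not show the real output shape.
sample = [
    {"timefield": "2020-10-01", "company": "acme", "avg(price_f)": 10.0},
    {"timefield": "2020-10-02", "company": "acme", "avg(price_f)": 12.0},
    {"timefield": "2020-10-01", "company": "zeta", "avg(price_f)": 7.5},
]
rows, columns, matrix = pivot_timeseries(
    sample, "timefield", "company", "avg(price_f)"
)
```

Each row of `matrix` is then one time line per split value, which is the row-oriented shape the description says *diff* and *minMaxScale* operate on. Filling missing buckets with 0.0 is only one possible choice and may not suit every metric.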