date:20201006

[jira] [Commented] (LUCENE-9563) Add .editorConfig

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208532#comment-17208532
 ] 

Dawid Weiss commented on LUCENE-9563:
-

I really don't know where to put it - I guess we can just version it in 
the repository?

> Add .editorConfig
> -
>
> Key: LUCENE-9563
> URL: https://issues.apache.org/jira/browse/LUCENE-9563
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
>
> I propose adding a ".editorConfig" to the root of the project.  Many text 
> editors and IDEs support this file to declare code style settings such as 
> indentation and more.  In particular, IntelliJ supports this natively and 
> Eclipse has a plugin for it.
> https://editorconfig.org
> I furthermore propose I simply generate this as an export of my current 
> IntelliJ code style, which is a code style I've been using and was originally 
> imported from Lucene's former IntelliJ config.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Created] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)

Dawid Weiss created LUCENE-9564:
---

 Summary: Format code automatically and enforce it
 Key: LUCENE-9564
 URL: https://issues.apache.org/jira/browse/LUCENE-9564
 Project: Lucene - Core
  Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss


This is a trivial change but a bold move. And I'm sure it's not for everyone.

I started using google java format [1] in my projects a while ago and have 
never looked back since. It is an oracle-style formatter (doesn't allow 
customizations or deviations from the defined 'ideal') - this takes some 
getting used to - but it also eliminates *all* the potential differences 
between IDEs, configs, etc.  And the formatted code typically looks much better 
than hand-edited code. It is also verifiable on precommit (so you can't commit 
code that deviates from what you'd get from automated formatting output).

The biggest benefit I see is that refactorings become such a joy and keep the 
code neat, everywhere. Before you commit you just reformat everything 
automatically, no matter how much you messed it up.

This isn't a change for everyone. I myself love hand-edited, neat code... but 
the reality is that with IDE support for automated code changes and so many 
people with different styles working on the same codebase keeping it neat is a 
big pain. 

Checkstyle and other tools are fine for ensuring certain rules but they don't 
take the burden of formatting off your shoulders. This tool does. 

Like I said - I had *great* reservations about using it at the beginning but 
over time got so used to it that I almost can't live without it now. It's like 
magic - you play with the code in any way you like, then run formatting and 
it's nice and neat.

The downside is that automated formatting does imply potential merge problems 
in backward patches (or any currently existing branches).

Like I said, it is a bold move. Just throwing this out for your consideration.




[GitHub] [lucene-solr] dweiss opened a new pull request #1950: LUCENE-9564: add spotless and gjf.

2020-10-06 Thread GitBox



dweiss opened a new pull request #1950:
URL: https://github.com/apache/lucene-solr/pull/1950


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-9564:

Description: 
This is a trivial change but a bold move. And I'm sure it's not for everyone.

I started using google java format [1] in my projects a while ago and have 
never looked back since. It is an oracle-style formatter (doesn't allow 
customizations or deviations from the defined 'ideal') - this takes some 
getting used to - but it also eliminates *all* the potential differences 
between IDEs, configs, etc.  And the formatted code typically looks much better 
than hand-edited code. It is also verifiable on precommit (so you can't commit 
code that deviates from what you'd get from automated formatting output).

The biggest benefit I see is that refactorings become such a joy and keep the 
code neat, everywhere. Before you commit you just reformat everything 
automatically, no matter how much you messed it up.

This isn't a change for everyone. I myself love hand-edited, neat code... but 
the reality is that with IDE support for automated code changes and so many 
people with different styles working on the same codebase keeping it neat is a 
big pain. 

Checkstyle and other tools are fine for ensuring certain rules but they don't 
take the burden of formatting off your shoulders. This tool does. 

Like I said - I had *great* reservations about using it at the beginning but 
over time got so used to it that I almost can't live without it now. It's like 
magic - you play with the code in any way you like, then run formatting and 
it's nice and neat.

The downside is that automated formatting does imply potential merge problems 
in backward patches (or any currently existing branches).

Like I said, it is a bold move. Just throwing this out for your consideration.

I've added a PR that adds spotless but it's not ready; some files would have to 
be excluded as they currently violate header rules.

A more interesting thing is here where the current code is automatically 
reformatted, for eyeballing only.

https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041



> Format code automatically and enforce it
> 
>
> Key: LUCENE-9564
> URL: https://issues.apache.org/jira/browse/LUCENE-9564
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h

[jira] [Commented] (LUCENE-9493) Remove obsolete dev-tools/{idea,netbeans,maven} folders

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208555#comment-17208555
 ] 

Dawid Weiss commented on LUCENE-9493:
-

> It'd be interesting if the gradle build could detect that IntelliJ is 
> importing it, first time setup in particular, and then tell the user about 
> the copyright profile and any other matter.

It is possible but it's not documented or officially supported... Look at 
intellij-idea.gradle.

> Remove obsolete dev-tools/{idea,netbeans,maven} folders
> ---
>
> Key: LUCENE-9493
> URL: https://issues.apache.org/jira/browse/LUCENE-9493
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I don't think they're used or applicable anymore. Thoughts?




[GitHub] [lucene-solr] uschindler commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2

2020-10-06 Thread GitBox



uschindler commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500087063



##
File path: solr/packaging/build.gradle
##
@@ -62,12 +63,17 @@ dependencies {
 
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
   But where is the documentation config defined? I did not find it 
anywhere else. I understand and I like it, but this is incomplete. And 
currently it does not work (I see no docs in ZIP files).






[GitHub] [lucene-solr] uschindler commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2

2020-10-06 Thread GitBox



uschindler commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500088372



##
File path: solr/packaging/build.gradle
##
@@ -62,12 +63,17 @@ dependencies {
 
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
   So as far as I understand: In the documentation.gradle file I would add 
a "configuration docs" for `:solr` and `:lucene` and then use `builtBy 
'documentation'` there. That's what I am missing here. I was about to set this 
up yesterday, but was not sure if I was missing something.
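
A minimal sketch of the producer side described in this comment, assuming the `documentation` task writes into `build/documentation`. That path and the `docsDir` variable are assumptions for illustration and are not taken from the PR:

```groovy
// Producer side, e.g. applied to the :solr project (sketch only).
configurations {
  docs
}

// Assumed output location of the 'documentation' task.
def docsDir = file("${buildDir}/documentation")

artifacts {
  docs docsDir, {
    // Declares which task produces the directory, so that anything
    // consuming the 'docs' configuration causes that task to run first.
    builtBy 'documentation'
  }
}
```

With something like this in place, the `docs project(path: ':solr', configuration: 'docs')` dependency from the diff would have a declared artifact to resolve against.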






[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208556#comment-17208556
 ] 

Dawid Weiss commented on SOLR-14870:


> The main thing that bogged me down is that the lifecycle of gradle "tasks" 
> really doesn't seem to make much sense, and there isn't a "clean" guide to 
> how to refactor a "task" (instance) into a re-usable "class" that many tasks 
> can be instances of.

It does make sense, it's just different from what you're used to. It's a bit 
like functional programming vs. imperative programming. You can't just "port" 
one into another.

> I tried to start by refactoring 'prepareSources' into a "class PrepareSources 
> extends Sync"

Those tasks are not really meant to be subclassed. They should be composed but 
the composition is typically based on task dependencies rather than in-code 
explicit calls. I suspect in this case your "PrepareSources" task is merely 
configuring the Sync spec - you don't need a specific class, only a task name 
and configuration of the existing Sync task. Easier explained on a concrete 
example than in theory but this is my guess.
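
To make the composition point concrete, here is a minimal sketch of configuring an existing Sync task instead of subclassing it. The task names, the `src/docs` source path and the output directory are hypothetical and not taken from the patch:

```groovy
// Configure a stock Sync task; no custom task class is needed.
task prepareSources(type: Sync) {
  from 'src/docs'                          // hypothetical input directory
  into "${buildDir}/prepared-sources"      // hypothetical output directory
}

// Compose through task dependencies rather than explicit calls.
task validateSources {
  dependsOn prepareSources
  doLast {
    // Reads whatever prepareSources synced into its destination directory.
    println "Validating files in ${prepareSources.destinationDir}"
  }
}
```

Running `validateSources` would run `prepareSources` first because of the declared dependency, which is the kind of composition meant here.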

> is there really no better way to get the 'main' classpath from the Project 
> than the '...getPlugin(JavaPluginConvention.class)...' hoops I jumped through?

In general there is no "main" classpath - there are source sets, configurations 
and a combination of these two. Java plugin creates two "convention" source 
sets - one for main classes and one for tests but you could have a different 
setup (and naming).
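
As an illustration only, assuming a project that applies the standard Java plugin and keeps its conventional `main` source set, the classpath can be reached through the source set directly. The task name below is hypothetical:

```groovy
plugins {
  id 'java'
}

// Hypothetical helper task that prints the runtime classpath of 'main'.
task printMainRuntimeClasspath {
  doLast {
    // Each source set exposes its own compile and runtime classpaths.
    sourceSets.main.runtimeClasspath.files.each { println it }
  }
}
```

A build with differently named source sets would need the matching name in place of `main`.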

> if this is all expected

Yes, I think this is expected. This change reflects module names. Simpler = 
better?

I can take a look at the patch later, unless Uwe beats me to it.




> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14870.patch
>
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is already currently linked to from the ref guide)




[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2

2020-10-06 Thread GitBox



dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500092191



##
File path: solr/packaging/build.gradle
##
@@ -62,12 +63,17 @@ dependencies {
 
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
   Ah, sorry - yes, you are correct. It should declare artifacts, the 
configuration to attach them to and the outputs. An example is in 
solr/example/build.gradle:
   
   artifacts {
     packaging packagingDir, {
       builtBy assemblePackaging
     }
   }






[GitHub] [lucene-solr] uschindler commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2

2020-10-06 Thread GitBox



uschindler commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500096124



##
File path: solr/packaging/build.gradle
##
@@ -62,12 +63,17 @@ dependencies {
 
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
   Ah cool. That is what I was missing. I will add this after lunch. I was 
not sure how to define the artifacts. The Gradle documentation was not very 
helpful to me.
   Thanks, and sorry for the stupid questions. 🤣






[GitHub] [lucene-solr] sigram opened a new pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.

2020-10-06 Thread GitBox



sigram opened a new pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951


   This replaces nearly all Map / List usage in `MetricUtils` with `MapWriter` 
/ `IteratorWriter`. The PR also includes changes to `Utils.MapWriterJSONWriter` 
to similarly avoid creating Map / List instances and instead serialize 
`MapWriter` / `IteratorWriter` directly.




[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2

2020-10-06 Thread GitBox



dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r500100437



##
File path: solr/packaging/build.gradle
##
@@ -62,12 +63,17 @@ dependencies {
 
   example project(path: ":solr:example", configuration: "packaging")
   server project(path: ":solr:server", configuration: "packaging")
+
+  // Copy files from documentation output
+  docs project(path: ':solr', configuration: 'docs')

Review comment:
   There are no stupid questions. This is the relevant documentation bit 
here, I believe.
   https://docs.gradle.org/current/userguide/cross_project_publications.html






[jira] [Commented] (SOLR-14691) Metrics reporting should avoid creating objects

2020-10-06 Thread Andrzej Bialecki (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208572#comment-17208572
 ] 

Andrzej Bialecki commented on SOLR-14691:
-

PR 1951 replaces all use of Map / List in {{MetricUtils}} and also adds similar 
improvements to {{MapWriterJSONWriter}} that is used whenever 
{{Utils.toJsonString(...)}} is used. If there are no objections I'll commit 
this shortly.

> Metrics reporting should avoid creating objects
> ---
>
> Key: SOLR-14691
> URL: https://issues.apache.org/jira/browse/SOLR-14691
> Project: Solr
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Critical
> Fix For: 8.7
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{MetricUtils}} unnecessarily creates a lot of short-lived objects (maps and 
> lists). This affects GC, especially since metrics are frequently polled by 
> clients. We should refactor it to use {{MapWriter}} as much as possible.
> Alternatively we could provide our wrappers or subclasses of Codahale metrics 
> that implement {{MapWriter}}, then a lot of complexity in {{MetricUtils}} 
> wouldn't be needed at all.




[jira] [Commented] (LUCENE-9542) Score returned in search request is original score and not reranked score

2020-10-06 Thread Christine Poerschke (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208576#comment-17208576
 ] 

Christine Poerschke commented on LUCENE-9542:
-

Thanks [~krishan1390] for opening this issue with a link into the code!
{quote}Score returned in search request is original score and not reranked 
score post ...
{quote}
Could you share more details on how to observe and reproduce the issue?

I just tried the following steps on {{branch_7_7}} and {{branch_8x}} and it 
_seemed_ that the reranked score was being returned; perhaps a particular kind 
of query or re-ranking query is needed to encounter the issue?

{code}
git checkout $branch # branch_7_7 or branch_8x

cd solr ; ant dist server # on master branch use gradle equivalent instead

bin/solr start -e techproducts

curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*'

curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16\}'

curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16+reRankWeight=1\}'

curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16+reRankWeight=2\}'

curl 'http://localhost:8983/solr/techproducts/select?rows=3&fl=id,score&q=*:*&rq=\{!rerank+reRankQuery=id:VDBDB1A16+reRankWeight=3\}'

bin/solr stop
{code}

> Score returned in search request is original score and not reranked score
> -
>
> Key: LUCENE-9542
> URL: https://issues.apache.org/jira/browse/LUCENE-9542
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.0
>Reporter: Krishan
>Priority: Major
>
> Score returned in search request is original score and not reranked score 
> post the changes in https://issues.apache.org/jira/browse/LUCENE-8412.
> Commit - 
> [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d]
> Specifically - 
> if (cmd.getSort() != null && query instanceof RankQuery == false && 
> (cmd.getFlags() & GET_SCORES) != 0) {
>     TopFieldCollector.populateScores(topDocs.scoreDocs, this, query);
> }
> in SolrIndexSearcher.java recomputes the score but outputs only the original 
> score and not the reranked score.
>  
> The issue is that cmd.getQuery() is a type of RankQuery but the "query" 
> variable is a boolean query; replacing query with cmd.getQuery() is probably 
> the right fix for this so that the score is not overridden for rerank queries
>  





[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Michael McCandless (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208615#comment-17208615
 ] 

Michael McCandless commented on LUCENE-9564:


+1 to achieve more consistent code styling, with minimal hassle.

How/when would it automatically run?

How far from Lucene's code style guidelines (WARNING PDF: [Sun's coding 
style|http://www.oracle.com/technetwork/java/codeconventions-150003.pdf] except 
2-space indent) is this one's?


[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208625#comment-17208625
 ] 

Dawid Weiss commented on LUCENE-9564:
-

The check itself could run in precommit. A "cleanup" task is just called 'tidy' 
- I wouldn't make it automatic; the precommit would fail if the code deviated 
from what's required. With time it just becomes a habit to call "gradlew tidy 
precommit"...
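
A rough sketch of how this could be wired up in a standalone build, assuming the Spotless Gradle plugin with google-java-format. The plugin version is an assumption, and the `tidy` and `precommit` tasks below are simple illustrative aliases, not the contents of the PR:

```groovy
plugins {
  id 'java'
  // Version is an assumption; use whichever Spotless release is current.
  id 'com.diffplug.spotless' version '5.8.2'
}

repositories {
  mavenCentral() // lets Spotless download google-java-format
}

spotless {
  java {
    googleJavaFormat()
  }
}

// Illustrative alias: rewrites sources in place.
task tidy {
  dependsOn 'spotlessApply'
}

// Illustrative alias: fails the build if any file deviates from the formatter output.
task precommit {
  dependsOn 'spotlessCheck'
}
```

The split keeps reformatting an explicit step: the check only reports deviations, and `tidy` is what actually changes files.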

> How far from Lucene's code style guidelines

I think it's close to Sun's... but it also handles so many things much better - 
automation is the key to strict consistency here (spaces around operators, 
breaking conditionals into multiple lines, if necessary). It also handles new 
language features well (formats chained calls and closures in a very reasonable 
way). 

Take a look at the diff in that commit above - you'll see the before-after.

My experience with this thing is that whenever I'm not happy with how it 
handled something, it is typically my fault (too complex expressions, too long 
variable names, broken Javadoc).


[jira] [Updated] (SOLR-14914) Add option to disable metrics collection

2020-10-06 Thread Andrzej Bialecki (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki updated SOLR-14914:

Fix Version/s: (was: master (9.0))


[jira] [Created] (SOLR-14914) Add option to disable metrics collection

2020-10-06 Thread Andrzej Bialecki (Jira)

Andrzej Bialecki created SOLR-14914:
---

 Summary: Add option to disable metrics collection
 Key: SOLR-14914
 URL: https://issues.apache.org/jira/browse/SOLR-14914
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: metrics
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki


Some users have expressed concerns about the overhead of metrics collection, 
and consequently the need to have an option to turn off the metrics collection 
altogether.

Metrics instrumentation in Solr cannot be itself easily removed or bypassed - 
in order to provide fine-grained metrics many code paths had to be changed and 
they now expect the metrics to be present (non-null). However, we can use the 
mechanism of {{MetricSupplier}} to use no-op implementation of all metrics, 
which would reduce the CPU overhead to basically the cost of an empty method 
call, and the memory overhead to a HashMap entry in a {{MetricRegistry}} 
(metric names still need to be tracked).
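
A minimal sketch of the no-op idea described above, under stated assumptions: the types below (SketchCounter, RealCounter, NoOpCounter, SketchRegistry) are hypothetical stand-ins written only for illustration and are not the Dropwizard or Solr classes. The point is that callers still receive a non-null metric, the registry still tracks one map entry per name, and the disabled path costs an empty method call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-ins: these types only mimic the idea, they are NOT the Dropwizard or Solr API.
interface SketchCounter {
  void inc();

  long getCount();
}

// Ordinary counter used when metrics collection is enabled.
final class RealCounter implements SketchCounter {
  private long count;

  @Override
  public void inc() {
    count++;
  }

  @Override
  public long getCount() {
    return count;
  }
}

// Stateless singleton shared by every metric name when collection is disabled.
final class NoOpCounter implements SketchCounter {
  static final NoOpCounter INSTANCE = new NoOpCounter();

  private NoOpCounter() {}

  @Override
  public void inc() {
    // intentionally empty: the remaining cost is the method call itself
  }

  @Override
  public long getCount() {
    return 0L;
  }
}

final class SketchRegistry {
  private final Map<String, SketchCounter> counters = new HashMap<>();
  private final Supplier<SketchCounter> supplier;

  SketchRegistry(boolean metricsEnabled) {
    // The supplier decides which implementation instrumented code receives.
    this.supplier = metricsEnabled ? RealCounter::new : () -> NoOpCounter.INSTANCE;
  }

  // Callers always get a non-null counter; the name is tracked either way.
  SketchCounter counter(String name) {
    return counters.computeIfAbsent(name, n -> supplier.get());
  }

  int size() {
    return counters.size();
  }
}

final class NoOpMetricsDemo {
  public static void main(String[] args) {
    SketchRegistry disabled = new SketchRegistry(false);
    disabled.counter("requests").inc();
    System.out.println(disabled.counter("requests").getCount());
    System.out.println(disabled.size());
  }
}
```

Because the no-op instance holds no state, sharing one instance across all names is safe, so the per-metric memory cost in this sketch is only the map entry.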




[jira] [Updated] (SOLR-14914) Add option to disable metrics collection

2020-10-06 Thread Andrzej Bialecki (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki updated SOLR-14914:

Fix Version/s: master (9.0)


[jira] [Updated] (SOLR-14914) Add option to disable metrics collection

2020-10-06 Thread Andrzej Bialecki (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki updated SOLR-14914:

Fix Version/s: 8.7


[jira] [Updated] (SOLR-14914) Add option to disable metrics collection

2020-10-06 Thread Andrzej Bialecki (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki updated SOLR-14914:

Description: 
Some users have expressed concerns about the overhead of metrics collection, 
and consequently the need to have an option to turn off the metrics collection 
altogether.

Metrics instrumentation in Solr cannot be itself easily removed or bypassed - 
in order to provide fine-grained metrics many code paths had to be changed and 
they now expect the metrics to be present (non-null). However, we can use the 
mechanism of {{MetricSupplier}} to provide singleton no-op implementations of 
all metrics, which would reduce the CPU overhead to basically the cost of an 
empty method call, and the memory overhead to a HashMap entry in a 
{{MetricRegistry}} (metric names still need to be tracked).

  was:
Some users have expressed concerns about the overhead of metrics collection, 
and consequently the need to have an option to turn off the metrics collection 
altogether.

Metrics instrumentation in Solr cannot be itself easily removed or bypassed - 
in order to provide fine-grained metrics many code paths had to be changed and 
they now expect the metrics to be present (non-null). However, we can use the 
mechanism of {{MetricSupplier}} to use no-op implementation of all metrics, 
which would reduce the CPU overhead to basically the cost of an empty method 
call, and the memory overhead to a HashMap entry in a {{MetricRegistry}} 
(metric names still need to be tracked).



[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Andrzej Bialecki (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208663#comment-17208663
 ] 

Andrzej Bialecki commented on LUCENE-9564:
--

+1 to consistent and enforced formatting.

I'm pretty sure we have quite a few mis-formatted files, probably more in Solr 
than in Lucene, and some devs are notorious for messy formatting - IMHO 
enforcing this wouldn't be such a bad idea, at least after some initial 
period of leniency (initially we could WARN developers and then switch to 
strict enforcement later).


[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.

2020-10-06 Thread GitBox



noblepaul commented on a change in pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951#discussion_r500206089



##
File path: solr/core/src/java/org/apache/solr/util/stats/MetricUtils.java
##
@@ -385,10 +406,10 @@ static double nsToMs(boolean convert, double value) {
   }
 
   // some snapshots represent time in ns, other snapshots represent raw values 
(eg. chunk size)
-  static void addSnapshot(Map response, Snapshot snapshot, 
PropertyFilter propertyFilter, boolean ms) {
+  static void addSnapshot(MapWriter.EntryWriter ew, Snapshot snapshot, 
PropertyFilter propertyFilter, boolean ms) {
 BiConsumer filter = (k, v) -> {
   if (propertyFilter.accept(k)) {

Review comment:
   I see this pattern repeated in many places. I think a helper method can 
be added to avoid code duplication





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [lucene-solr] noblepaul commented on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.

2020-10-06 Thread GitBox



noblepaul commented on pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704215473


   Can we get rid of the interface `PropertyFilter` and replace it with a 
`Predicate`?
   
   You can use the class `ConditionalKeyMapWriter` to filter
   
   `MetricUtils.addMetrics()`: Why can't we just have an anonymous class that 
implements `MapWriter` and avoid the creation of `NamedList`?
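
A rough sketch of the `Predicate` suggestion, under stated assumptions: `PredicateFilterSketch` and its `filtered` helper are hypothetical names made up for illustration, and nothing here is the actual `MetricUtils` code. It shows how a standard `java.util.function.Predicate<String>` can stand in for a custom single-method filter interface, and how one small helper could absorb the repeated "accept the key, then write the entry" pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

final class PredicateFilterSketch {

  // Hypothetical helper: wraps a consumer so it only sees entries whose key passes the predicate.
  static <V> BiConsumer<String, V> filtered(
      Predicate<String> accept, BiConsumer<String, V> target) {
    return (k, v) -> {
      if (accept.test(k)) {
        target.accept(k, v);
      }
    };
  }

  // Writes three made-up properties through a filter that keeps only keys starting with "p".
  static Map<String, Object> demo() {
    Map<String, Object> out = new LinkedHashMap<>();
    BiConsumer<String, Object> sink = filtered(k -> k.startsWith("p"), out::put);
    sink.accept("p95_ms", 12.5);
    sink.accept("mean_ms", 3.0);
    sink.accept("p99_ms", 20.0);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(demo().keySet());
  }
}
```

A side benefit of `Predicate` is that filters compose with `and`, `or` and `negate` without extra code.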




[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208679#comment-17208679
 ] 

Erick Erickson commented on LUCENE-9564:


+1 I can adapt to most any style more easily than a dozen different ones.

+1 to having it enforced automatically

I'd recommend it be as close to the current recommendations as possible of 
course.

All at once or piecemeal? My preference would be all at once, someone 
volunteers to just do the entire code base in one massive push. Plus, it's 
easier to say "ignore all diffs in commit 1234, they're all formatting" than 
deal with them one-by-one.

Is there a way to mark something as "leave me alone"? In the diff you linked 
to, Lucene50SkipReader.java, about line 35 there's a comment that's been 
reformatted that's certainly better in the older format. Or is there a way to 
_not_ reformat javadocs, and if so, WDYT about that? This is not a 
deal-breaker, just askin'.

IntelliJ's diff has a nifty "ignore all whitespace" option that would take some 
of the pain out of comparing a file from after the reformatting with one before 
the reformatting. That would help some, how much remains to be seen.

What timing were you thinking? It'd be a little less difficult for people to 
make this coincident with releasing 9.0 as there'd be fewer back-porting 
issues. Again, that's not a deal-breaker.


[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.

2020-10-06 Thread GitBox



noblepaul commented on a change in pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951#discussion_r500216190



##
File path: 
solr/core/src/test/org/apache/solr/handler/admin/MetricsHandlerTest.java
##
@@ -308,19 +318,24 @@ public void testKeyMetrics() throws Exception {
 handler.handleRequestBody(req(CommonParams.QT, "/admin/metrics", 
CommonParams.WT, "json",
 MetricsHandler.KEY_PARAM, key1, MetricsHandler.KEY_PARAM, key2, 
MetricsHandler.KEY_PARAM, key3), resp);
 values = resp.getValues();
-val = values.findRecursive("metrics", key1);
+map = new HashMap<>();
+values.toMap(map);
+val = Utils.getObjectByPath(map, false, "metrics/" + key1);
 assertNotNull(val);
-val = values.findRecursive("metrics", key2);
+val = Utils.getObjectByPath(map, true, "metrics/" + key2);
 assertNotNull(val);
-val = values.findRecursive("metrics", key3);
+val = Utils.getObjectByPath(map, true, "metrics/" + key3);
 assertNotNull(val);
 
 String key4 = "solr.core.collection1:QUERY./select.requestTimes:1minRate";
 resp = new SolrQueryResponse();
 handler.handleRequestBody(req(CommonParams.QT, "/admin/metrics", 
CommonParams.WT, "json",
 MetricsHandler.KEY_PARAM, key4), resp);
 values = resp.getValues();
-val = values.findRecursive("metrics", key4);
+map = new HashMap<>();

Review comment:
   There is no need to convert this to `Map`.
   
   `values._get()` can fetch nested values






[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration

2020-10-06 Thread GitBox



mayya-sharipova edited a comment on pull request #1943:
URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914


   @jpountz Sorry for the noise, I have found the cause of this error, and the 
latest commit addresses it.
   Basically this PR will just address the failing test of 
`TestUnifiedHighlighterStrictPhrases.testBasics`.
   
   My next steps will  the following:
   Plan A:
   - use leap-frog logic without using ConjunctionDISI for this method  (a 
separate PR for that)
   - Then reintroduce again checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I 
reverted,  as sort optimization by introducing new iterators in the middle of 
iteration would break these checks.   
   
   Plan B:
   - continue to use ConjunctionDISI to combine scorerIterator and 
collectorIterator
   - from checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only 
the check on the same doc during constructor and remove all other checks.




[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration

2020-10-06 Thread GitBox



mayya-sharipova edited a comment on pull request #1943:
URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914


   @jpountz Sorry for the noise, I have found the cause of this error, and the 
latest commit addresses it.
   Basically this PR will just address the failing test of 
`TestUnifiedHighlighterStrictPhrases.testBasics`.
   
   My next steps will be the following:
   Plan A:
   - use leap-frog logic without using ConjunctionDISI for this method  (a 
separate PR for that)
   - Then reintroduce again checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I 
reverted
   
   Plan B:
   - continue to use ConjunctionDISI to combine scorerIterator and 
collectorIterator
   - from checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only 
the check on the same doc during constructor and remove all other checks,  as 
sort optimization by introducing new iterators in the middle of iteration would 
break these checks.   
   
   I am interested in your opinion which plan is better?




[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration

2020-10-06 Thread GitBox



mayya-sharipova edited a comment on pull request #1943:
URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914


   @jpountz Sorry for the noise, I have found the cause of this error, and the 
latest commit addresses it.
   Basically this PR will just address the failing test of 
`TestUnifiedHighlighterStrictPhrases.testBasics`.
   
   My next steps will be the following:
   Plan A:
   - use leap-frog logic without using ConjunctionDISI for this method  (a 
separate PR for that)
   - Then reintroduce again checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I 
reverted
   
   Plan B:
   - continue to use ConjunctionDISI to combine scorerIterator and 
collectorIterator
   - from checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only 
the check on the same doc during constructor and remove all other checks,  as 
sort optimization by introducing new DISI (from -1 doc) in the middle of 
iteration would break these checks.   
   
   I am interested in your opinion which plan is better?




[GitHub] [lucene-solr] mayya-sharipova edited a comment on pull request #1943: LUCENE-9555 Advance conjuction Iterator for two phase iteration

2020-10-06 Thread GitBox



mayya-sharipova edited a comment on pull request #1943:
URL: https://github.com/apache/lucene-solr/pull/1943#issuecomment-703885914


   @jpountz Sorry for the noise, I have found the cause of this error, and the 
latest commit addresses it.
   Basically this PR will just address the failing test of 
`TestUnifiedHighlighterStrictPhrases.testBasics`.
   
   My next steps will be the following:
   Plan A:
   - use leap-frog logic without using ConjunctionDISI for this method  (a 
separate PR for that)
   - Then reintroduce again checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) that I 
reverted
   
   Plan B:
   - continue to use ConjunctionDISI to combine scorerIterator and 
collectorIterator
   - from checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937) keep only 
the check that the sub-iterators are on the same doc in the constructor and 
remove all other checks, as the sort optimization, by introducing a new DISI 
(from -1 doc) in the middle of iteration, would break these checks.   
   
   I am interested in your opinion which plan is better?
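
   For readers unfamiliar with the term, here is a minimal sketch of leap-frog intersection of two sorted doc-ID streams. It is only an illustration under stated assumptions: `ArrayIterator` is a hypothetical stand-in over an `int[]`, not Lucene's `DocIdSetIterator`, and this is not the code in the PR. Each side repeatedly advances to the other side's current doc until both agree.

```java
import java.util.Arrays;

final class LeapFrogSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  // Hypothetical stand-in for a doc-ID iterator over a sorted array; starts unpositioned at -1.
  static final class ArrayIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayIterator(int[] sortedDocs) {
      this.docs = sortedDocs;
    }

    int docID() {
      if (idx < 0) {
        return -1;
      }
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    int nextDoc() {
      idx++;
      return docID();
    }

    // Moves past the current doc to the first doc >= target.
    int advance(int target) {
      do {
        idx++;
      } while (idx < docs.length && docs[idx] < target);
      return docID();
    }
  }

  // Returns the doc IDs present in both sorted inputs.
  static int[] intersect(int[] left, int[] right) {
    ArrayIterator a = new ArrayIterator(left);
    ArrayIterator b = new ArrayIterator(right);
    int[] out = new int[Math.min(left.length, right.length)];
    int n = 0;
    int doc = a.nextDoc();
    while (doc != NO_MORE_DOCS) {
      int other = b.docID();
      if (other < doc) {
        other = b.advance(doc); // b leaps to a's candidate
      }
      if (other == doc) {
        out[n++] = doc; // both sides agree: a match
        doc = a.nextDoc();
      } else {
        doc = a.advance(other); // a leaps to b's candidate (or to NO_MORE_DOCS)
      }
    }
    return Arrays.copyOf(out, n);
  }

  public static void main(String[] args) {
    int[] hits = intersect(new int[] {1, 3, 5, 8, 13}, new int[] {2, 3, 8, 9, 13, 21});
    System.out.println(Arrays.toString(hits));
  }
}
```

   Note that the sketch assumes both iterators start unpositioned; an iterator introduced in the middle of iteration starts at -1 while the other side is already positioned, which is the situation the checks discussed above would reject.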




[GitHub] [lucene-solr] noblepaul commented on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.

2020-10-06 Thread GitBox



noblepaul commented on pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704237134


   Is it possible to change  `MetricsMap implements Gauge>` 
to 
   
   ` MetricsMap implements Gauge`
   




[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208702#comment-17208702
 ] 

Dawid Weiss commented on LUCENE-9564:
-

> I'd recommend it be as close to the current recommendations as possible of 
> course.

Like I said, it's not possible - this is not a configurable format, you accept 
it as it is.

> Is there a way to mark something as "leave me alone"?

No, there isn't (on purpose). Javadocs - I'm not sure but again - when you look 
at the diff, it seems better formatted than not (and those places that are 
screwed up typically required a pre tag anyway).

You can exclude entire files from this check - this could be applied to 
generated files... although it really doesn't matter much if you do format them 
(they'll just look nicer) because they'll be consistently cleaned up after 
regeneration.

> Format code automatically and enforce it
> 
>
> Key: LUCENE-9564
> URL: https://issues.apache.org/jira/browse/LUCENE-9564
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a trivial change but a bold move. And I'm sure it's not for everyone.
> I started using google java format [1] in my projects a while ago and have 
> never looked back since. It is an oracle-style formatter (doesn't allow 
> customizations or deviations from the defined 'ideal') - this takes some 
> getting used to - but it also eliminates *all* the potential differences 
> between IDEs, configs, etc.  And the formatted code typically looks much 
> better than hand-edited one. It is also verifiable on precommit (so you can't 
> commit code that deviates from what you'd get from automated formatting 
> output).
> The biggest benefit I see is that refactorings become such a joy and keep the 
> code neat, everywhere. Before you commit you just reformat everything 
> automatically, no matter how much you messed it up.
> This isn't a change for everyone. I myself love hand-edited, neat code... but 
> the reality is that with IDE support for automated code changes and so many 
> people with different styles working on the same codebase keeping it neat is 
> a big pain. 
> Checkstyle and other tools are fine for ensuring certain rules but they don't 
> take the burden of formatting off your shoulders. This tool does. 
> Like I said - I had *great* reservations about using it at the beginning but 
> over time got so used to it that I almost can't live without it now. It's 
> like magic - you play with the code in any way you like, then run formatting 
> and it's nice and neat.
> The downside is that automated formatting does imply potential merge problems 
> in backward patches (or any currently existing branches).
> Like I said, it is a bold move. Just throwing this for your consideration.
> I've added a PR that adds spotless but it's not ready; some files would have 
> to be excluded as they currently violate header rules.
> A more interesting thing is here where the current code is automatically 
> reformatted, for eyeballing only.
> https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041




[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208720#comment-17208720
 ] 

Dawid Weiss commented on LUCENE-9564:
-

Google's style guide:
https://google.github.io/styleguide/javaguide.html

Formatter itself (interesting code):
https://github.com/google/google-java-format


[jira] [Updated] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-9564:

Description: 
This is a trivial change but a bold move. And I'm sure it's not for everyone.

I started using google java format [1] in my projects a while ago and have 
never looked back since. It is an oracle-style formatter (doesn't allow 
customizations or deviations from the defined 'ideal') - this takes some 
getting used to - but it also eliminates *all* the potential differences 
between IDEs, configs, etc.  And the formatted code typically looks much better 
than hand-edited one. It is also verifiable on precommit (so you can't commit 
code that deviates from what you'd get from automated formatting output).

The biggest benefit I see is that refactorings become such a joy and keep the 
code neat, everywhere. Before you commit you just reformat everything 
automatically, no matter how much you messed it up.

This isn't a change for everyone. I myself love hand-edited, neat code... but 
the reality is that with IDE support for automated code changes and so many 
people with different styles working on the same codebase keeping it neat is a 
big pain. 

Checkstyle and other tools are fine for ensuring certain rules but they don't 
take the burden of formatting off your shoulders. This tool does. 

Like I said - I had *great* reservations about using it at the beginning but 
over time got so used to it that I almost can't live without it now. It's like 
magic - you play with the code in any way you like, then run formatting and 
it's nice and neat.

The downside is that automated formatting does imply potential merge problems 
in backward patches (or any currently existing branches).

Like I said, it is a bold move. Just throwing this for your consideration.

I've added a PR that adds spotless but it's not ready; some files would have to 
be excluded as they currently violate header rules.

A more interesting thing is here where the current code is automatically 
reformatted, for eyeballing only.

https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041

[1] https://google.github.io/styleguide/javaguide.html

  was:
This is a trivial change but a bold move. And I'm sure it's not for everyone.

I started using google java format [1] in my projects a while ago and have 
never looked back since. It is an oracle-style formatter (doesn't allow 
customizations or deviations from the defined 'ideal') - this takes some 
getting used to - but it also eliminates *all* the potential differences 
between IDEs, configs, etc.  And the formatted code typically looks much better 
than hand-edited one. It is also verifiable on precommit (so you can't commit 
code that deviates from what you'd get from automated formatting output).

The biggest benefit I see is that refactorings become such a joy and keep the 
code neat, everywhere. Before you commit you just reformat everything 
automatically, no matter how much you messed it up.

This isn't a change for everyone. I myself love hand-edited, neat code... but 
the reality is that with IDE support for automated code changes and so many 
people with different styles working on the same codebase keeping it neat is a 
big pain. 

Checkstyle and other tools are fine for ensuring certain rules but they don't 
take the burden of formatting off your shoulders. This tool does. 

Like I said - I had *great* reservations about using it at the beginning but 
over time got so used to it that I almost can't live without it now. It's like 
magic - you play with the code in any way you like, then run formatting and 
it's nice and neat.

The downside is that automated formatting does imply potential merge problems 
in backward patches (or any currently existing branches).

Like I said, it is a bold move. Just throwing this for your consideration.

I've added a PR that adds spotless but it's not ready; some files would have to 
be excluded as they currently violate header rules.

A more interesting thing is here where the current code is automatically 
reformatted, for eyeballing only.

https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041



[jira] [Updated] (LUCENE-9555) Advance conjunction Iterator for two phase iteration

2020-10-06 Thread Mayya Sharipova (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova updated LUCENE-9555:

Summary: Advance conjunction Iterator for two phase iteration  (was: Sort 
optimization failure if scorerIterator is already advanced)

> Advance conjunction Iterator for two phase iteration
> 
>
> Key: LUCENE-9555
> URL: https://issues.apache.org/jira/browse/LUCENE-9555
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Some collectors provide iterators that can efficiently skip non-competitive 
> docs. When using DefaultBulkScorer#score function we create a conjunction of 
> scorerIterator and collectorIterator. The problem could be if scorerIterator 
> has already been advanced. As collectorIterator always starts from a docID = 
> -1, and for creation of conjunction iterator we need all of its
>  sub-iterators to be on the same doc, the creation of conjunction iterator 
> will fail.
> We need to create a conjunction between scorerIterator and collectorIterator 
> only if scorerIterator has not been advanced yet. 
> Relates to https://issues.apache.org/jira/browse/LUCENE-9280
> Relates to https://issues.apache.org/jira/browse/LUCENE-9541
>  
>  




[jira] [Updated] (LUCENE-9555) Advance conjunction Iterator for two phase iteration

2020-10-06 Thread Mayya Sharipova (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova updated LUCENE-9555:

Description: 
LUCENE-9280 introduced a sort optimization that allows non-competitive
documents to be skipped.
However, there was a bug when a two-phase approximation was used:
the approximation was advanced without also advancing
the overall conjunction iterator.

Relates to https://issues.apache.org/jira/browse/LUCENE-9280 

 

  was:
Some collectors provide iterators that can efficiently skip non-competitive 
docs. When using DefaultBulkScorer#score function we create a conjunction of 
scorerIterator and collectorIterator. The problem could be if scorerIterator 
has already been advanced. As collectorIterator always starts from a docID = 
-1, and for creation of conjunction iterator we need all of its
 sub-iterators to be on the same doc, the creation of conjunction iterator will 
fail.

We need to create a conjunction between scorerIterator and collectorIterator 
only if scorerIterator has not been advanced yet. 

Relates to https://issues.apache.org/jira/browse/LUCENE-9280

Relates to https://issues.apache.org/jira/browse/LUCENE-9541

 

 


> Advance conjunction Iterator for two phase iteration
> 
>
> Key: LUCENE-9555
> URL: https://issues.apache.org/jira/browse/LUCENE-9555
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> LUCENE-9280 introduced a sort optimization where
> documents can be skipped.
> But there was a bug in case we were using two phase
> approximation, as we would advance it without advancing
> an overall conjunction iterator. 
> Relates to https://issues.apache.org/jira/browse/LUCENE-9280 
>  




[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208739#comment-17208739
 ] 

Erick Erickson commented on LUCENE-9564:


Ah, so you're saying that the bits I pointed out would be left alone if they 
have a <pre> tag? That fixes that problem if so.


[jira] [Commented] (LUCENE-9555) Advance conjunction Iterator for two phase iteration

2020-10-06 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208741#comment-17208741
 ] 

ASF subversion and git services commented on LUCENE-9555:
-

Commit 6ac94a6f9fba724b00bc886be2b1620d8db47d83 in lucene-solr's branch 
refs/heads/master from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6ac94a6 ]

LUCENE-9555: Advance conjunction Iterator for two phase iteration (#1943)

PR #1351 introduced a sort optimization where
documents can be skipped.
But there was a bug in case we were using two phase
approximation, as we would advance it without advancing
an overall conjunction iterator.

This patch fixed it.

Relates to #1351

> Advance conjunction Iterator for two phase iteration
> 
>
> Key: LUCENE-9555
> URL: https://issues.apache.org/jira/browse/LUCENE-9555
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> LUCENE-9280 introduced a sort optimization where
> documents can be skipped.
> But there was a bug in case we were using two phase
> approximation, as we would advance it without advancing
> an overall conjunction iterator. 
> Relates to https://issues.apache.org/jira/browse/LUCENE-9280 
>  




[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208763#comment-17208763
 ] 

Dawid Weiss commented on LUCENE-9564:
-

Ideally, they should be wrapped in code blocks too -- see here:
https://reflectoring.io/howto-format-code-snippets-in-javadoc/#pre--code
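
For illustration, a minimal sketch of what such a Javadoc comment could look like. The class and method names are made up for this example, and it rests on the assumption (as I understand the formatter's behavior) that the contents of a `<pre>` block are left untouched:

```java
/** Hypothetical utility, used only to illustrate Javadoc snippet formatting. */
final class JavadocSnippetExample {
  private JavadocSnippetExample() {}

  /**
   * Returns the sum of the given values.
   *
   * <p>Example usage:
   *
   * <pre>{@code
   * int total = JavadocSnippetExample.sum(1, 2, 3);
   * if (total > 5) {
   *   System.out.println("total = " + total);
   * }
   * }</pre>
   *
   * @param values the values to add up
   * @return the sum of all values, or 0 if none are given
   */
  static int sum(int... values) {
    int total = 0;
    for (int v : values) {
      total += v;
    }
    return total;
  }
}
```

The `<pre>` element preserves line breaks and indentation, while `{@code ...}` keeps characters such as `<` and `>` from being interpreted as HTML. Braces inside the snippet have to stay balanced for `{@code}` to parse.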


[jira] [Commented] (LUCENE-9555) Advance conjunction Iterator for two phase iteration

2020-10-06 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208752#comment-17208752
 ] 

ASF subversion and git services commented on LUCENE-9555:
-

Commit e7bf2dc8b324b65b384904959284aef9630ef761 in lucene-solr's branch 
refs/heads/branch_8x from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e7bf2dc ]

LUCENE-9555: Advance conjunction Iterator for two phase iteration (#1943)

PR #1351 introduced a sort optimization where
documents can be skipped.
But there was a bug in case we were using two phase
approximation, as we would advance it without advancing
an overall conjunction iterator.

This patch fixed it.

Relates to #1351


[jira] [Created] (LUCENE-9565) Fix iteration over competitive iterators

2020-10-06 Thread Mayya Sharipova (Jira)

Mayya Sharipova created LUCENE-9565:
---

 Summary: Fix iteration over competitive iterators
 Key: LUCENE-9565
 URL: https://issues.apache.org/jira/browse/LUCENE-9565
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Mayya Sharipova


LUCENE-9280 introduced a sort optimization where documents can be skipped.
However, iteration over competitive iterators was not handled correctly:
the iterators did not store the current docID, so
when a competitive iterator was updated, the current doc ID was lost.
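
One possible shape of a fix, as a rough sketch only and not necessarily what the patch does: keep the current doc ID in a wrapper, so that swapping the underlying iterator does not lose the position. `DocIterator`, `ArrayDocIterator` and `TrackingCompetitiveIterator` are hypothetical stand-ins invented for this example; they are not Lucene classes.

```java
final class CompetitiveIteratorSketch {
  private CompetitiveIteratorSketch() {}

  /** Hypothetical, stripped-down stand-in for a doc ID iterator. */
  interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Current doc ID, or -1 if the iterator has not been positioned yet. */
    int docID();

    /** Moves to the first doc ID that is >= target and returns it. */
    int advance(int target);
  }

  /** Iterates over a sorted array of doc IDs. */
  static final class ArrayDocIterator implements DocIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayDocIterator(int... docs) {
      this.docs = docs;
    }

    @Override
    public int docID() {
      if (idx < 0) {
        return -1;
      }
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    @Override
    public int advance(int target) {
      if (idx < 0) {
        idx = 0;
      }
      while (idx < docs.length && docs[idx] < target) {
        idx++;
      }
      return docID();
    }
  }

  /** Remembers the current doc ID itself, independently of the delegate. */
  static final class TrackingCompetitiveIterator implements DocIterator {
    private DocIterator delegate;
    private int doc = -1;

    TrackingCompetitiveIterator(DocIterator initial) {
      this.delegate = initial;
    }

    /** Swaps the underlying iterator; the recorded doc ID is kept as is. */
    void update(DocIterator newDelegate) {
      this.delegate = newDelegate;
    }

    @Override
    public int docID() {
      return doc;
    }

    @Override
    public int advance(int target) {
      doc = delegate.advance(target);
      return doc;
    }
  }
}
```

With this arrangement, a caller that reads `docID()` right after an update still sees the last doc it was positioned on, rather than the -1 of a freshly created delegate.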




[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



mayya-sharipova opened a new pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952


   PR #1351 introduced a sort optimization where documents can be skipped.
   However, iteration over competitive iterators was not handled correctly:
   the iterators did not store the current docID, so
   when a competitive iterator was updated, the current doc ID was lost.
   
   This patch fixes it.
   
   Relates to #1351



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [lucene-solr] mayya-sharipova commented on pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



mayya-sharipova commented on pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952#issuecomment-704311136


   This patch works well with checks in  
[ConjunctionDISI](https://github.com/apache/lucene-solr/pull/1937), and all 
tests pass as well.
   
   After we merge this PR, we can reintroduce the checks in ConjunctionDISI.




[GitHub] [lucene-solr] jpountz commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



jpountz commented on a change in pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500372658



##
File path: lucene/core/src/java/org/apache/lucene/search/Weight.java
##
@@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, int min, int max) thr
   collector.setScorer(scorer);
   DocIdSetIterator scorerIterator = twoPhase == null ? iterator : twoPhase.approximation();
   DocIdSetIterator collectorIterator = collector.competitiveIterator();
-  // if possible filter scorerIterator to keep only competitive docs as defined by collector
-  DocIdSetIterator filteredIterator = collectorIterator == null ? scorerIterator :
-      ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, collectorIterator));
+  DocIdSetIterator filteredIterator = scorerIterator;
+  if (collectorIterator != null) {
+    if (scorerIterator.docID() != -1) {
+      collectorIterator.advance(scorerIterator.docID());
+    }

Review comment:
   I don't think that this is good enough as we might be advancing ahead of 
scorerIterator? This was why I thought that we should instead wrap 
scorerIterator in such a way that its initial docID would be -1.
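
   A rough sketch of the kind of wrapper I have in mind, only to make the idea concrete. `DocIterator`, `ArrayDocIterator` and `UnpositionedView` are hypothetical names made up for this example (a simplified stand-in for `DocIdSetIterator`, without `nextDoc()` or `cost()`), not code from this PR:

```java
final class StartFromMinusOneSketch {
  private StartFromMinusOneSketch() {}

  /** Hypothetical, stripped-down stand-in for a doc ID iterator. */
  interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Current doc ID, or -1 if the iterator has not been positioned yet. */
    int docID();

    /** Moves to the first doc ID that is >= target and returns it. */
    int advance(int target);
  }

  /** Iterates over a sorted array of doc IDs. */
  static final class ArrayDocIterator implements DocIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayDocIterator(int... docs) {
      this.docs = docs;
    }

    @Override
    public int docID() {
      if (idx < 0) {
        return -1;
      }
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    @Override
    public int advance(int target) {
      if (idx < 0) {
        idx = 0;
      }
      while (idx < docs.length && docs[idx] < target) {
        idx++;
      }
      return docID();
    }
  }

  /** Presents an already positioned iterator as if it still sat on -1. */
  static final class UnpositionedView implements DocIterator {
    private final DocIterator in;
    private int doc = -1;

    UnpositionedView(DocIterator in) {
      this.in = in;
    }

    @Override
    public int docID() {
      return doc;
    }

    @Override
    public int advance(int target) {
      // The wrapped iterator may already sit on a doc >= target;
      // only move it when it is behind the requested target.
      int current = in.docID();
      doc = current >= target ? current : in.advance(target);
      return doc;
    }
  }
}
```

   Since the view reports -1 until it is first advanced, it can be combined with a collector iterator that also starts at -1, and the doc the scorer is already positioned on is still returned by the first `advance` call instead of being skipped.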






[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Varun Thacker (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208840#comment-17208840
 ] 

Varun Thacker commented on LUCENE-9564:
---

+1 to consistent and enforced formatting.

 

The one downside is git blame will now show the commit that reformatted the 
code in all these files.


[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208845#comment-17208845
 ] 

Dawid Weiss commented on LUCENE-9564:
-

Yes, you're correct. I don't know if there is a way around this - very likely 
not. It's a trivial change but it does have big side effects (and pros and cons).

We don't have to do it, I just wanted to let you know about this option. I've 
been *really* happy with this automated-code-formatting workflow and it's been 
a few months now. 


[jira] [Updated] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-9564:

Description: 
This is a trivial change but a bold move. And I'm sure it's not for everyone.

I started using google java format [1] in my projects a while ago and have 
never looked back since. It is an oracle-style formatter (doesn't allow 
customizations or deviations from the defined 'ideal') - this takes some 
getting used to - but it also eliminates *all* the potential differences 
between IDEs, configs, etc.  And the formatted code typically looks much better 
than hand-edited one. It is also verifiable on precommit (so you can't commit 
code that deviates from what you'd get from automated formatting output).

The biggest benefit I see is that refactorings become such a joy and keep the 
code neat, everywhere. Before you commit you just reformat everything 
automatically, no matter how much you messed it up.

This isn't a change for everyone. I myself love hand-edited, neat code... but 
the reality is that with IDE support for automated code changes and so many 
people with different styles working on the same codebase keeping it neat is a 
big pain. 

Checkstyle and other tools are fine for ensuring certain rules but they don't 
take the burden of formatting off your shoulders. This tool does. 

Like I said - I had *great* reservations about using it at the beginning but 
over time got so used to it that I almost can't live without it now. It's like 
magic - you play with the code in any way you like, then run formatting and 
it's nice and neat.

The downside is that automated formatting does imply potential merge problems 
in backward patches (or any currently existing branches).

Like I said, it is a bold move. Just throwing this for your consideration.

-I've added a PR that adds spotless but it's not ready; some files would have 
to be excluded as they currently violate header rules.-

A more interesting thing is here where the current code is automatically 
reformatted - this branch is for eyeballing only.

https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example

[1] https://google.github.io/styleguide/javaguide.html

  was:
This is a trivial change but a bold move. And I'm sure it's not for everyone.

I started using google java format [1] in my projects a while ago and have 
never looked back since. It is an oracle-style formatter (doesn't allow 
customizations or deviations from the defined 'ideal') - this takes some 
getting used to - but it also eliminates *all* the potential differences 
between IDEs, configs, etc.  And the formatted code typically looks much better 
than hand-edited one. It is also verifiable on precommit (so you can't commit 
code that deviates from what you'd get from automated formatting output).

The biggest benefit I see is that refactorings become such a joy and keep the 
code neat, everywhere. Before you commit you just reformat everything 
automatically, no matter how much you messed it up.

This isn't a change for everyone. I myself love hand-edited, neat code... but 
the reality is that with IDE support for automated code changes and so many 
people with different styles working on the same codebase keeping it neat is a 
big pain. 

Checkstyle and other tools are fine for ensuring certain rules but they don't 
take the burden of formatting off your shoulders. This tool does. 

Like I said - I had *great* reservations about using it at the beginning but 
over time got so used to it that I almost can't live without it now. It's like 
magic - you play with the code in any way you like, then run formatting and 
it's nice and neat.

The downside is that automated formatting does imply potential merge problems 
in backward patches (or any currently existing branches).

Like I said, it is a bold move. Just throwing this for your consideration.

I've added a PR that adds spotless but it's not ready; some files would have to 
be excluded as they currently violate header rules.

A more interesting thing is here where the current code is automatically 
reformatted, for eyeballing only.

https://github.com/dweiss/lucene-solr/commit/80e8f14ca61a13781bc812967a9e38cdbe656041

[1] https://google.github.io/styleguide/javaguide.html


> Format code automatically and enforce it
> 
>
> Key: LUCENE-9564
> URL: https://issues.apache.org/jira/browse/LUCENE-9564
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a trivial change but a bold move. And I'm sure it's not for everyone.
> I started using google java format [1] in my projects a while ago and have 
> never looked back since. It is an oracle-style form

[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208856#comment-17208856
 ] 

Dawid Weiss commented on SOLR-14870:


I took a peek. Is there any particular reason why you wanted to make this 
PrepareSources a java task and not just use DSL, Chris?

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14870.patch
>
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first
> build the javadocs; then build the ref-guide; then validate _all_ links in
> the ref-guide, even those to (local) javadocs.
> While the "local.javadocs" property logic _inside_ the
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage
> this functionality from the "solr" project doesn't seem to have been
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds
> a nonsense javadoc link to the ref-guide (or removes a class/method whose
> javadoc is already linked to from the ref guide).




[GitHub] [lucene-solr] mayya-sharipova merged pull request #1943: LUCENE-9555: Advance conjuction Iterator for two phase iteration

2020-10-06 Thread GitBox



mayya-sharipova merged pull request #1943:
URL: https://github.com/apache/lucene-solr/pull/1943


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (SOLR-14066) Deprecate DIH and migrate to a community supported package

2020-10-06 Thread Rui Pimentel (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208861#comment-17208861
 ] 

Rui Pimentel commented on SOLR-14066:
-

Hi Erich Siffert,

Did you try with Oracle

> Deprecate DIH and migrate to a community supported package
> --
>
> Key: SOLR-14066
> URL: https://issues.apache.org/jira/browse/SOLR-14066
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
> Fix For: 8.6
>
> Attachments: image-2019-12-14-19-58-39-314.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> DIH doesn't need to remain inside Solr anymore. Plan is to deprecate DIH in 
> 8.6, remove from 9.0. A community supported version of DIH (which can be used 
> with Solr's package manager) can be found here 
> https://github.com/rohitbemax/dataimporthandler.




[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



mayya-sharipova commented on a change in pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500412063



##
File path: lucene/core/src/java/org/apache/lucene/search/Weight.java
##
@@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
   collector.setScorer(scorer);
   DocIdSetIterator scorerIterator = twoPhase == null ? iterator : 
twoPhase.approximation();
   DocIdSetIterator collectorIterator = collector.competitiveIterator();
-  // if possible filter scorerIterator to keep only competitive docs as 
defined by collector
-  DocIdSetIterator filteredIterator = collectorIterator == null ? 
scorerIterator :
-  ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, 
collectorIterator));
+  DocIdSetIterator filteredIterator = scorerIterator;
+  if (collectorIterator != null) {
+if (scorerIterator.docID() != -1) {
+  collectorIterator.advance(scorerIterator.docID());
+}

Review comment:
   @jpountz  Addressed in cba6cf75f48a26cd48f09152c25518c91fd6660d






[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208864#comment-17208864
 ] 

Dawid Weiss commented on SOLR-14870:


Perhaps I should clarify that if the intention was to pull out common 
configuration and logic then it'd be more natural to define multiple tasks and 
just run the same configuration closure over all these tasks (that handles 
common setup). We do use it in a few places, like in 
solr-forbidden-apis.gradle, for example:
{code}
configure(project(":solr:core")) {
  tasks.matching { it.name == "forbiddenApisMain" || it.name == "forbiddenApisTest" }.all {
    exclude "org/apache/solr/internal/**"
    exclude "org/apache/hadoop/**"
  }
}
{code}

this applies the same configuration to two tasks (selected by name). You could 
just as well do something like:

configure([task1, task2]) {
  // common setup.
}

and it'd work too.

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14870.patch
>
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first
> build the javadocs; then build the ref-guide; then validate _all_ links in
> the ref-guide, even those to (local) javadocs.
> While the "local.javadocs" property logic _inside_ the
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage
> this functionality from the "solr" project doesn't seem to have been
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds
> a nonsense javadoc link to the ref-guide (or removes a class/method whose
> javadoc is already linked to from the ref guide).




[jira] [Comment Edited] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208864#comment-17208864
 ] 

Dawid Weiss edited comment on SOLR-14870 at 10/6/20, 3:56 PM:
--

Perhaps I should clarify that if the intention was to pull out common 
configuration and logic then it'd be more natural to define multiple tasks and 
just run the same configuration closure over all these tasks (that handles 
common setup). We do use it in a few places, like in 
solr-forbidden-apis.gradle, for example:
{code}
configure(project(":solr:core")) {
  tasks.matching { it.name == "forbiddenApisMain" || it.name == "forbiddenApisTest" }.all {
    exclude "org/apache/solr/internal/**"
    exclude "org/apache/hadoop/**"
  }
}
{code}

this applies the same configuration to two tasks (selected by name). You could 
just as well do something like:
{code}
configure([task1, task2]) {
  // common setup.
}
{code}
and it'd work too.


was (Author: dweiss):
Perhaps I should clarify that if the intention was to pull out common 
configuration and logic then it'd be more natural to define multiple tasks and 
just run the same configuration closure over all these tasks (that handles 
common setup). We do use it in a few places, like in 
solr-forbidden-apis.gradle, for example:
{code}
configure(project(":solr:core")) {
  tasks.matching { it.name == "forbiddenApisMain" || it.name == "forbiddenApisTest" }.all {
    exclude "org/apache/solr/internal/**"
    exclude "org/apache/hadoop/**"
  }
}
{code}

this applies the same configuration to two tasks (selected by name). You could 
just as well do something like:

configure([task1, task2]) {
  // common setup.
}

and it'd work too.

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14870.patch
>
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
>  depends="javadocs,changes-to-html,process-webpages">
>  inheritall="false">
>   
>   
> 
>   
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first
> build the javadocs; then build the ref-guide; then validate _all_ links in
> the ref-guide, even those to (local) javadocs.
> While the "local.javadocs" property logic _inside_ the
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage
> this functionality from the "solr" project doesn't seem to have been
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds
> a nonsense javadoc link to the ref-guide (or removes a class/method whose
> javadoc is already linked to from the ref guide).




[jira] [Comment Edited] (SOLR-14066) Deprecate DIH and migrate to a community supported package

2020-10-06 Thread Rui Pimentel (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208861#comment-17208861
 ] 

Rui Pimentel edited comment on SOLR-14066 at 10/6/20, 4:03 PM:
---

Hi Erich Siffert/Ishan Chattopadhyaya

Did you try with Oracle ?


was (Author: informat...@spautores.pt):
Hi Erich Siffert/Ishan Chattopadhyaya

Did you try with Oracle

> Deprecate DIH and migrate to a community supported package
> --
>
> Key: SOLR-14066
> URL: https://issues.apache.org/jira/browse/SOLR-14066
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
> Fix For: 8.6
>
> Attachments: image-2019-12-14-19-58-39-314.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> DIH doesn't need to remain inside Solr anymore. Plan is to deprecate DIH in 
> 8.6, remove from 9.0. A community supported version of DIH (which can be used 
> with Solr's package manager) can be found here 
> https://github.com/rohitbemax/dataimporthandler.




[jira] [Comment Edited] (SOLR-14066) Deprecate DIH and migrate to a community supported package

2020-10-06 Thread Rui Pimentel (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208861#comment-17208861
 ] 

Rui Pimentel edited comment on SOLR-14066 at 10/6/20, 4:03 PM:
---

Hi Erich Siffert/Ishan Chattopadhyaya

Did you try with Oracle


was (Author: informat...@spautores.pt):
Hi Erich Siffert,

Did you try with Oracle

> Deprecate DIH and migrate to a community supported package
> --
>
> Key: SOLR-14066
> URL: https://issues.apache.org/jira/browse/SOLR-14066
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
> Fix For: 8.6
>
> Attachments: image-2019-12-14-19-58-39-314.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> DIH doesn't need to remain inside Solr anymore. Plan is to deprecate DIH in 
> 8.6, remove from 9.0. A community supported version of DIH (which can be used 
> with Solr's package manager) can be found here 
> https://github.com/rohitbemax/dataimporthandler.




[GitHub] [lucene-solr] jpountz commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



jpountz commented on a change in pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500453140



##
File path: lucene/core/src/java/org/apache/lucene/search/Weight.java
##
@@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
   collector.setScorer(scorer);
   DocIdSetIterator scorerIterator = twoPhase == null ? iterator : 
twoPhase.approximation();
   DocIdSetIterator collectorIterator = collector.competitiveIterator();
-  // if possible filter scorerIterator to keep only competitive docs as 
defined by collector
-  DocIdSetIterator filteredIterator = collectorIterator == null ? 
scorerIterator :
-  ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, 
collectorIterator));
+  DocIdSetIterator filteredIterator = scorerIterator;
+  if (collectorIterator != null) {
+if (scorerIterator.docID() != -1) {
+  collectorIterator.advance(scorerIterator.docID());
+}

Review comment:
   oh, I had not considered setting `scorerIterator.docID()` as a min 
docID, maybe this means that we no longer need the `min` parameter of 
`RangeDISIWrapper`?






[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



mayya-sharipova commented on a change in pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500461526



##
File path: lucene/core/src/java/org/apache/lucene/search/Weight.java
##
@@ -204,9 +204,14 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
   collector.setScorer(scorer);
   DocIdSetIterator scorerIterator = twoPhase == null ? iterator : 
twoPhase.approximation();
   DocIdSetIterator collectorIterator = collector.competitiveIterator();
-  // if possible filter scorerIterator to keep only competitive docs as 
defined by collector
-  DocIdSetIterator filteredIterator = collectorIterator == null ? 
scorerIterator :
-  ConjunctionDISI.intersectIterators(Arrays.asList(scorerIterator, 
collectorIterator));
+  DocIdSetIterator filteredIterator = scorerIterator;
+  if (collectorIterator != null) {
+if (scorerIterator.docID() != -1) {
+  collectorIterator.advance(scorerIterator.docID());
+}

Review comment:
   Thanks @jpountz , addressed in d42c4649c81364f13c51c0b147dd600e57d7






[GitHub] [lucene-solr] mayya-sharipova merged pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



mayya-sharipova merged pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952


   




[jira] [Commented] (LUCENE-9565) Fix iteration over competitive iterators

2020-10-06 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208971#comment-17208971
 ] 

ASF subversion and git services commented on LUCENE-9565:
-

Commit 874c446ab945aded465d12f01085af89b83563c6 in lucene-solr's branch 
refs/heads/master from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=874c446 ]

LUCENE-9565 Fix competitive iteration (#1952)

PR #1351 introduced a sort optimization where documents can be skipped.
But iteration over competitive iterators was not properly organized,
as they were not storing the current docID, and
when competitive iterator was updated the current doc ID was lost.

This patch fixed it.

Relates to #1351

> Fix iteration over competitive iterators
> 
>
> Key: LUCENE-9565
> URL: https://issues.apache.org/jira/browse/LUCENE-9565
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> LUCENE-9280 introduced a sort optimization where documents can be skipped.
> But iteration over competitive iterators was not properly organized,
> as they were not storing the current docID, and
> when competitive iterator was updated the current doc ID was lost.
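
For readers following the review diff on Weight.java, here is a minimal sketch of the catch-up idea: before intersecting, the collector's competitive iterator is advanced to the doc ID the scorer is already on. It is an illustration under stated assumptions, not the code from PR #1952. SortedDocs and CompetitiveIterationSketch are hypothetical names, SortedDocs is a simplified array-backed stand-in rather than Lucene's DocIdSetIterator, and its advance() is more lenient than Lucene's contract.

```java
import java.util.ArrayList;
import java.util.List;

final class CompetitiveIterationSketch {

  /** Simplified stand-in for a doc ID iterator; this is NOT Lucene's DocIdSetIterator. */
  static final class SortedDocs {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs; // strictly increasing doc IDs
    private int idx = -1;     // -1 means "not positioned yet"

    SortedDocs(int... docs) {
      this.docs = docs;
    }

    /** Current doc ID: -1 before the first move, NO_MORE_DOCS once exhausted. */
    int docID() {
      if (idx < 0) {
        return -1;
      }
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    int nextDoc() {
      idx++;
      return docID();
    }

    /** Moves forward (never backward) to the first doc ID >= target. */
    int advance(int target) {
      if (idx < 0) {
        idx = 0;
      }
      while (idx < docs.length && docs[idx] < target) {
        idx++;
      }
      return docID();
    }
  }

  /** Returns doc IDs present in both iterators, starting at the scorer's current position. */
  static List<Integer> collect(SortedDocs scorer, SortedDocs competitive) {
    List<Integer> hits = new ArrayList<>();
    // Resume from wherever the scorer already is instead of assuming a fresh start.
    int doc = scorer.docID() == -1 ? scorer.nextDoc() : scorer.docID();
    while (doc != SortedDocs.NO_MORE_DOCS) {
      // Catch-up step: bring the competitive iterator to the scorer's doc ID.
      int candidate = competitive.advance(doc);
      if (candidate == doc) {
        hits.add(doc);
        doc = scorer.nextDoc();
      } else {
        // The competitive iterator skipped ahead, so the scorer follows it.
        doc = scorer.advance(candidate);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    SortedDocs scorer = new SortedDocs(1, 3, 5, 7, 9);
    scorer.advance(5); // the scorer was already positioned on doc 5
    SortedDocs competitive = new SortedDocs(1, 5, 9);
    System.out.println(collect(scorer, competitive)); // prints [5, 9]
  }
}
```

Doc 1 is in both lists but is not collected, because the scorer had already moved past it. The sketch keeps the scorer's position as the source of truth and only ever moves the competitive iterator forward to meet it.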




[jira] [Commented] (LUCENE-9541) Ensure sub-iterators of ConjunctionDISI are on the same document

2020-10-06 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208974#comment-17208974
 ] 

ASF subversion and git services commented on LUCENE-9541:
-

Commit 6b8288445f96e3cb5715e0f106ea2202dab57561 in lucene-solr's branch 
refs/heads/master from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6b82884 ]

LUCENE-9541 ConjunctionDISI sub-iterators check (#1937)

* LUCENE-9541 ConjunctionDISI sub-iterators check

Ensure sub-iterators of a conjunction iterator are on the same doc.

> Ensure sub-iterators of ConjunctionDISI are on the same document
> 
>
> Key: LUCENE-9541
> URL: https://issues.apache.org/jira/browse/LUCENE-9541
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead, a DocIdSetIterator,
> and doesn't consider that its other component, a BitSetIterator, may have
> already advanced past a certain doc. This may result in duplicate documents.
> For example, suppose BitSetConjunctionDISI _disi_ is composed of DocIdSetIterator
> _a_ of docs [0,1] and BitSetIterator _b_ of docs [0,1]. Calling `b.nextDoc()`
> collects doc0, and calling `disi.nextDoc()` then collects the same
> doc0 again.
> It seems that other conjunction iterators don't have this behaviour: if we
> advance any of their components past a certain document, the whole
> conjunction iterator is also advanced past this document.
>  
> This behaviour was exposed in this 
> [PR|https://github.com/apache/lucene-solr/pull/1903]. 
>  
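
The check named in the commit message ("Ensure sub-iterators of a conjunction iterator are on the same doc") could look roughly like the sketch below. This is an assumption about its shape, not the code merged in #1937. SameDocCheckSketch and requireSameDoc are hypothetical names, and plain int doc IDs stand in for the iterators' docID() values.

```java
final class SameDocCheckSketch {

  /** Throws if the given current doc IDs are not all equal; an empty list is accepted. */
  static void requireSameDoc(int... currentDocIds) {
    for (int docId : currentDocIds) {
      if (docId != currentDocIds[0]) {
        throw new IllegalArgumentException(
            "sub-iterators are on different documents: expected "
                + currentDocIds[0] + " but found " + docId);
      }
    }
  }

  public static void main(String[] args) {
    requireSameDoc(-1, -1); // both sub-iterators unpositioned: accepted
    try {
      // The second sub-iterator was already moved to doc 0 on its own,
      // as with b.nextDoc() in the example from the issue description.
      requireSameDoc(-1, 0);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Rejecting mismatched positions when the conjunction is built turns the silent duplicate from the description into an immediate failure.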




[jira] [Commented] (LUCENE-9541) Ensure sub-iterators of ConjunctionDISI are on the same document

2020-10-06 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208973#comment-17208973
 ] 

ASF subversion and git services commented on LUCENE-9541:
-

Commit 6b8288445f96e3cb5715e0f106ea2202dab57561 in lucene-solr's branch 
refs/heads/master from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6b82884 ]

LUCENE-9541 ConjunctionDISI sub-iterators check (#1937)

* LUCENE-9541 ConjunctionDISI sub-iterators check

Ensure sub-iterators of a conjunction iterator are on the same doc.

> Ensure sub-iterators of ConjunctionDISI are on the same document
> 
>
> Key: LUCENE-9541
> URL: https://issues.apache.org/jira/browse/LUCENE-9541
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead, a DocIdSetIterator,
> and doesn't consider that its other component, a BitSetIterator, may have
> already advanced past a certain doc. This may result in duplicate documents.
> For example, suppose BitSetConjunctionDISI _disi_ is composed of DocIdSetIterator
> _a_ of docs [0,1] and BitSetIterator _b_ of docs [0,1]. Calling `b.nextDoc()`
> collects doc0, and calling `disi.nextDoc()` then collects the same
> doc0 again.
> It seems that other conjunction iterators don't have this behaviour: if we
> advance any of their components past a certain document, the whole
> conjunction iterator is also advanced past this document.
>  
> This behaviour was exposed in this 
> [PR|https://github.com/apache/lucene-solr/pull/1903]. 
>  




[jira] [Commented] (LUCENE-9565) Fix iteration over competitive iterators

2020-10-06 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208979#comment-17208979
 ] 

ASF subversion and git services commented on LUCENE-9565:
-

Commit 16d25ace3f4c273d45284c0e7ebf9915b245995a in lucene-solr's branch 
refs/heads/branch_8x from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=16d25ac ]

LUCENE-9565 Fix competitive iteration (#1952)

PR #1351 introduced a sort optimization where documents can be skipped.
But iteration over competitive iterators was not properly organized,
as they were not storing the current docID, and
when competitive iterator was updated the current doc ID was lost.

This patch fixed it.

Relates to #1351

> Fix iteration over competitive iterators
> 
>
> Key: LUCENE-9565
> URL: https://issues.apache.org/jira/browse/LUCENE-9565
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> LUCENE-9280 introduced a sort optimization where documents can be skipped.
> But iteration over competitive iterators was not properly organized,
> as they were not storing the current docID, and
> when competitive iterator was updated the current doc ID was lost.




[jira] [Commented] (LUCENE-9541) Ensure sub-iterators of ConjunctionDISI are on the same document

2020-10-06 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208980#comment-17208980
 ] 

ASF subversion and git services commented on LUCENE-9541:
-

Commit 66c49a354023a6e77b67839cc59a1e499fcd6536 in lucene-solr's branch 
refs/heads/branch_8x from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=66c49a3 ]

LUCENE-9541 ConjunctionDISI sub-iterators check (#1937)

Ensure sub-iterators of a conjunction iterator are on the same doc.


> Ensure sub-iterators of ConjunctionDISI are on the same document
> 
>
> Key: LUCENE-9541
> URL: https://issues.apache.org/jira/browse/LUCENE-9541
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – 
> and doesn't consider that its other component – a BitSetIterator – may have 
> already advanced past a certain doc. This may result in duplicate documents.
> For example, if BitSetConjunctionDISI _disi_ is composed of DocIdSetIterator 
> _a_ of docs [0,1] and BitSetIterator _b_ of docs [0,1], then doing `b.nextDoc()` 
> collects doc0, and doing `disi.nextDoc()` collects the same doc0 again.
> It seems that other conjunction iterators don't have this behaviour: if we 
> advance any of their components past a certain document, the whole 
> conjunction iterator will also be advanced past this document. 
>  
> This behaviour was exposed in this 
> [PR|https://github.com/apache/lucene-solr/pull/1903]. 
>  
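
A minimal simulation of the duplicate described above, in Python rather than against Lucene's API. `DocIterator` and `LeadDrivenConjunction` are hypothetical stand-ins, and the `check` flag is only one assumed way to enforce that sub-iterators sit on the same doc; it is not taken from the actual patch.

```python
NO_MORE_DOCS = 2**31 - 1  # sentinel for an exhausted iterator


class DocIterator:
    """Hypothetical iterator over a sorted list of doc IDs."""

    def __init__(self, docs):
        self.docs = sorted(docs)
        self.doc = -1  # unpositioned

    def next_doc(self):
        later = [d for d in self.docs if d > self.doc]
        self.doc = later[0] if later else NO_MORE_DOCS
        return self.doc


class LeadDrivenConjunction:
    """Advances only the lead and tests each candidate against the other docs."""

    def __init__(self, lead, other, check=False):
        self.lead, self.other, self.check = lead, other, check

    def next_doc(self):
        if self.check and self.lead.doc != self.other.doc:
            raise ValueError("sub-iterators are not on the same doc")
        while True:
            doc = self.lead.next_doc()
            if doc == NO_MORE_DOCS or doc in self.other.docs:
                self.other.doc = doc  # keep the other component in step
                return doc


# Unchecked: advancing b on its own, then the conjunction, yields doc 0 twice.
a, b = DocIterator([0, 1]), DocIterator([0, 1])
disi = LeadDrivenConjunction(a, b)
collected = [b.next_doc(), disi.next_doc()]

# Checked: the same sequence is rejected because a and b are out of step.
a2, b2 = DocIterator([0, 1]), DocIterator([0, 1])
checked = LeadDrivenConjunction(a2, b2, check=True)
b2.next_doc()
try:
    checked.next_doc()
    rejected = False
except ValueError:
    rejected = True
print(collected, rejected)
```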




[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209000#comment-17209000
 ] 

Robert Muir commented on LUCENE-9564:
-

https://github.com/psf/black#migrating-your-code-style-without-ruining-git-blame

> Format code automatically and enforce it
> 
>
> Key: LUCENE-9564
> URL: https://issues.apache.org/jira/browse/LUCENE-9564
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a trivial change but a bold move. And I'm sure it's not for everyone.
> I started using google java format [1] in my projects a while ago and have 
> never looked back since. It is an oracle-style formatter (doesn't allow 
> customizations or deviations from the defined 'ideal') - this takes some 
> getting used to - but it also eliminates *all* the potential differences 
> between IDEs, configs, etc.  And the formatted code typically looks much 
> better than hand-edited one. It is also verifiable on precommit (so you can't 
> commit code that deviates from what you'd get from automated formatting 
> output).
> The biggest benefit I see is that refactorings become such a joy and keep the 
> code neat, everywhere. Before you commit you just reformat everything 
> automatically, no matter how much you messed it up.
> This isn't a change for everyone. I myself love hand-edited, neat code... but 
> the reality is that with IDE support for automated code changes and so many 
> people with different styles working on the same codebase keeping it neat is 
> a big pain. 
> Checkstyle and other tools are fine for ensuring certain rules but they don't 
> take the burden of formatting off your shoulders. This tool does. 
> Like I said - I had *great* reservations about using it at the beginning but 
> over time got so used to it that I almost can't live without it now. It's 
> like magic - you play with the code in any way you like, then run formatting 
> and it's nice and neat.
> The downside is that automated formatting does imply potential merge problems 
> in backward patches (or any currently existing branches).
> Like I said, it is a bold move. Just throwing this for your consideration.
> -I've added a PR that adds spotless but it's not ready; some files would have 
> to be excluded as they currently violate header rules.-
> A more interesting thing is here where the current code is automatically 
> reformatted - this branch is for eyeballing only.
> https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example
> [1] https://google.github.io/styleguide/javaguide.html




[GitHub] [lucene-solr] sigram commented on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.

2020-10-06 Thread GitBox



sigram commented on pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704446815


   > Can we get rid of the interface PropertyFilter and replace it with a 
Predicate?
   Again, we're limited by back-compat in 8x. We could add another set of 
methods in `MetricUtils` that use `Predicate` and deprecate 
`PropertyFilter` ... but that would double the number of methods.
   
   > MetricUtils.addMetrics()
   Not sure what you're referring to - this method doesn't create a NamedList. 
It's used only in `OverseerStatusCmd`, which creates a NamedList anyway for 
other stuff.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [lucene-solr] sigram edited a comment on pull request #1951: SOLR-14691: Reduce object creation by using MapWriter / IteratorWriter.

2020-10-06 Thread GitBox



sigram edited a comment on pull request #1951:
URL: https://github.com/apache/lucene-solr/pull/1951#issuecomment-704446815


   > Can we get rid of the interface PropertyFilter and replace it with a 
Predicate?
   
   Again, we're limited by back-compat in 8x. We could add another set of 
methods in `MetricUtils` that use `Predicate` and deprecate 
`PropertyFilter` ... but that would double the number of methods.
   
   > MetricUtils.addMetrics()
   
   Not sure what you're referring to - this method doesn't create a NamedList. 
It's used only in `OverseerStatusCmd`, which creates a NamedList anyway for 
other stuff.




[jira] [Created] (SOLR-14915) Prometheus-exporter should not depend on Solr-core

2020-10-06 Thread David Smiley (Jira)

David Smiley created SOLR-14915:
---

 Summary: Prometheus-exporter should not depend on Solr-core
 Key: SOLR-14915
 URL: https://issues.apache.org/jira/browse/SOLR-14915
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: contrib - prometheus-exporter
Reporter: David Smiley
Assignee: David Smiley


I think it's *crazy* that our Prometheus exporter depends on Solr-core -- this 
thing is a _client_ of Solr; it does not live within Solr.  The exporter ought 
to be fairly lean.  One consequence of this dependency is that, for example, 
security vulnerabilities reported against Solr (e.g. Jetty) can (and do, where 
I work) wind up being reported against this module even though Prometheus isn't 
using Jetty.

From my evaluation today of what's going on, it appears the crux of the 
problem is that the prometheus exporter uses some utility mechanisms in 
Solr-core like XmlConfig (which depends on SolrResourceLoader and the rabbit 
hole goes deeper...) and DOMUtils (further depends on PropertiesUtil).  It can 
easily be made to not use XmlConfig.  DOMUtils & PropertiesUtil could move to 
SolrJ which already has lots of little dependency-free utilities needed by 
SolrJ and Solr-core alike.




[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209025#comment-17209025
 ] 

Dawid Weiss commented on LUCENE-9564:
-

Bull's eye! Thanks Robert. Interesting tool too - similar motivation to the one 
I tried to outline.

> Format code automatically and enforce it
> 
>
> Key: LUCENE-9564
> URL: https://issues.apache.org/jira/browse/LUCENE-9564
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a trivial change but a bold move. And I'm sure it's not for everyone.
> I started using google java format [1] in my projects a while ago and have 
> never looked back since. It is an oracle-style formatter (doesn't allow 
> customizations or deviations from the defined 'ideal') - this takes some 
> getting used to - but it also eliminates *all* the potential differences 
> between IDEs, configs, etc.  And the formatted code typically looks much 
> better than hand-edited one. It is also verifiable on precommit (so you can't 
> commit code that deviates from what you'd get from automated formatting 
> output).
> The biggest benefit I see is that refactorings become such a joy and keep the 
> code neat, everywhere. Before you commit you just reformat everything 
> automatically, no matter how much you messed it up.
> This isn't a change for everyone. I myself love hand-edited, neat code... but 
> the reality is that with IDE support for automated code changes and so many 
> people with different styles working on the same codebase keeping it neat is 
> a big pain. 
> Checkstyle and other tools are fine for ensuring certain rules but they don't 
> take the burden of formatting off your shoulders. This tool does. 
> Like I said - I had *great* reservations about using it at the beginning but 
> over time got so used to it that I almost can't live without it now. It's 
> like magic - you play with the code in any way you like, then run formatting 
> and it's nice and neat.
> The downside is that automated formatting does imply potential merge problems 
> in backward patches (or any currently existing branches).
> Like I said, it is a bold move. Just throwing this for your consideration.
> -I've added a PR that adds spotless but it's not ready; some files would have 
> to be excluded as they currently violate header rules.-
> A more interesting thing is here where the current code is automatically 
> reformatted - this branch is for eyeballing only.
> https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example
> [1] https://google.github.io/styleguide/javaguide.html




[jira] [Commented] (SOLR-13392) Unable to start prometheus-exporter in 7x branch

2020-10-06 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209031#comment-17209031
 ] 

David Smiley commented on SOLR-13392:
-

bq. Do we have a test that tries to launch the exporter to avoid such bugs in 
the future?

Apparently not!  Docker is perfect for such a test, and we finally have Docker 
in our source tree, but no Jenkins setup for it yet (whatever that may be).

BTW I just filed SOLR-14915 to change the exporter to not depend on solr-core; 
I think that dependency is crazy.

> Unable to start prometheus-exporter in 7x branch
> 
>
> Key: SOLR-13392
> URL: https://issues.apache.org/jira/browse/SOLR-13392
> Project: Solr
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 7.7.2
>Reporter: Karl Stoney
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Fix For: 7.7.2, 8.1, master (9.0)
>
> Attachments: SOLR-13392.patch
>
>
> Hi, 
> prometheus-exporter doesn't start in branch 7x on commit 
> 7dfe1c093b65f77407c2df4c2a1120a213aef166, it does work on 
> 26b498d0a9d25626a15e25b0cf97c8339114263a so something has changed between 
> those two commits causing this.
> I am presuming it is 
> https://github.com/apache/lucene-solr/commit/e1eeafb5dc077976646b06f4cba4d77534963fa9#diff-3f7b27f0f087632739effa2aa508d77eR34
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/lucene/util/IOUtils
> at 
> org.apache.solr.core.SolrResourceLoader.close(SolrResourceLoader.java:881)
> at 
> org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:221)
> at 
> org.apache.solr.prometheus.exporter.SolrExporter.main(SolrExporter.java:205)
> Caused by: java.lang.ClassNotFoundException: org.apache.lucene.util.IOUtils
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 3 more




[jira] [Created] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)

Joel Bernstein created SOLR-14916:
-

 Summary: Add split parameter to timeseries Streaming Expression
 Key: SOLR-14916
 URL: https://issues.apache.org/jira/browse/SOLR-14916
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein


Currently the time series function only supports the time aggregations across 
the dimension. This ticket will add the split parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-limit and split-sort parameters will also be added to control 
number and order of values in split field. 




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}timeseries(collection1, q="*:*", split="company", split-rows=10, 
split-sort="avg(price_f) desc",  field="timefield", gap="+1DAY", 
format="-dd-MM" ){code}
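
As a rough illustration of the intended semantics, the following Python sketch reproduces the sample expression over in-memory rows. It is not Solr code: `split_timeseries` and its arguments are hypothetical, the rows are invented, and the time field is assumed to be already truncated to one-day buckets (standing in for the one-day gap in the sample). The sort is assumed to rank split values by their overall average, descending.

```python
from collections import defaultdict
from statistics import mean


def split_timeseries(rows, split, field, metric, split_rows):
    """Return {split value: {time bucket: avg(metric)}} for the top split values."""
    by_split = defaultdict(list)
    for row in rows:
        by_split[row[split]].append(row)

    # Assumed meaning of split-sort "avg(metric) desc" plus split-rows=N:
    # rank split values by their overall average and keep the first N.
    top = sorted(
        by_split,
        key=lambda value: mean(r[metric] for r in by_split[value]),
        reverse=True,
    )[:split_rows]

    result = {}
    for value in top:
        buckets = defaultdict(list)
        for row in by_split[value]:
            buckets[row[field]].append(row[metric])
        # One time line per split value, ordered by time bucket.
        result[value] = {day: mean(vals) for day, vals in sorted(buckets.items())}
    return result


rows = [
    {"company": "A", "timefield": "2020-10-01", "price_f": 10.0},
    {"company": "A", "timefield": "2020-10-01", "price_f": 20.0},
    {"company": "A", "timefield": "2020-10-02", "price_f": 30.0},
    {"company": "B", "timefield": "2020-10-01", "price_f": 5.0},
    {"company": "C", "timefield": "2020-10-02", "price_f": 40.0},
]
result = split_timeseries(rows, "company", "timefield", "price_f", split_rows=2)
print(result)
```

With two split rows requested, company B (lowest average) is dropped and the remaining companies each get their own time line.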



  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}timeserie(collection1, q="*:*", split="company", split-rows=10, 
split-sort="avg(price_f) desc",  field="timefield", gap="+1DAY", 
format="-dd-MM" ){code}




> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time series reports per each 
> split. The split-rows and split-sort parameters will also be added to control 
> number and order of values in split field. 
> Sample syntax:
> {code}timeseries(collection1, q="*:*", split="company", split-rows=10, 
> split-sort="avg(price_f) desc",  field="timefield", gap="+1DAY", 
> format="-dd-MM" ){code}




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}timeserie(collection1, q="*:*", split="company", split-rows=10, 
split-sort="avg(price_f) desc",  field="timefield", gap="+1DAY", 
format="-dd-MM" ){code}



  was:Currently the time series function only supports the time aggregations 
across the dimension. This ticket will add the split parameter which will add a 
top level split by categorical field, to produce time series reports per each 
split. The split-limit and split-sort parameters will also be added to control 
number and order of values in split field. 


> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time series reports per each 
> split. The split-rows and split-sort parameters will also be added to control 
> number and order of values in split field. 
> Sample syntax:
> {code}timeserie(collection1, q="*:*", split="company", split-rows=10, 
> split-sort="avg(price_f) desc",  field="timefield", gap="+1DAY", 
> format="-dd-MM" ){code}




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
  q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}



  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}




> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time series reports per each 
> split. The split-rows and split-sort parameters will also be added to control 
> number and order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>   q="*:*", 
>split="company", 
>split-rows=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}



  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}timeseries(collection1, q="*:*", split="company", split-rows=10, 
split-sort="avg(price_f) desc",  field="timefield", gap="+1DAY", 
format="-dd-MM" ){code}




> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time series reports per each 
> split. The split-rows and split-sort parameters will also be added to control 
> number and order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-rows=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}



  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
  q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}




> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time series reports per each 
> split. The split-rows and split-sort parameters will also be added to control 
> number and order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-rows=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}



  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}




> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time series reports per each 
> split. The split-rows and split-sort parameters will also be added to control 
> number and order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-rows=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 
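
A sketch of what such a pivot could look like, in Python. The tuple shape (one tuple per split value and time bucket, carrying the aggregate) is an assumption made for illustration, not the documented output of the function, and `pivot` is a hypothetical helper.

```python
def pivot(tuples, row_key, col_key, value_key, fill=0.0):
    """Pivot flat tuples into (row labels, column labels, matrix)."""
    rows = sorted({t[row_key] for t in tuples})
    cols = sorted({t[col_key] for t in tuples})
    cells = {(t[row_key], t[col_key]): t[value_key] for t in tuples}
    # Missing (row, column) combinations are filled so every row has equal length.
    matrix = [[cells.get((r, c), fill) for c in cols] for r in rows]
    return rows, cols, matrix


tuples = [
    {"company": "A", "timefield": "2020-10-01", "avg(price_f)": 15.0},
    {"company": "A", "timefield": "2020-10-02", "avg(price_f)": 30.0},
    {"company": "C", "timefield": "2020-10-02", "avg(price_f)": 40.0},
]
companies, days, matrix = pivot(tuples, "company", "timefield", "avg(price_f)")
print(companies, days, matrix)
```

Each row of the matrix is then one split value's time line, which is the shape needed for row-wise correlation or clustering.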






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}




> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time series reports per each 
> split. The split-rows and split-sort parameters will also be added to control 
> number and order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-rows=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function. 




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-rows and split-sort parameters will also be added to control number and 
order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time series reports per each 
split. The split-rows and split-sort parameters will also be added to control 
number and order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 







> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-rows and split-sort parameters will also be added to control number and 
> order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-rows=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function. 




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-rows and split-sort parameters will also be added to control number and 
order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-rows=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 







> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control number 
> and order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function. 




[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209046#comment-17209046
 ] 

Joel Bernstein commented on SOLR-14916:
---

[~aroopganguly], how does this design look to you?

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control number 
> and order of values in split field. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function. 




[jira] [Created] (SOLR-14917) Move DOMUtil and PropertiesUtil to SolrJ, o.a.s.common/util

2020-10-06 Thread David Smiley (Jira)

David Smiley created SOLR-14917:
---

 Summary: Move DOMUtil and PropertiesUtil to SolrJ, 
o.a.s.common/util
 Key: SOLR-14917
 URL: https://issues.apache.org/jira/browse/SOLR-14917
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: David Smiley
Assignee: David Smiley


DOMUtil has some XML DOM utilities, and PropertiesUtil has property 
substitution utilities.  They are fairly isolated and can easily move from 
o.a.s.util in solr-core to o.a.s.common.util package in SolrJ.  

Moving such things should happen in 9.x, but I suppose in 8.x a deprecated 
subclass could be added in both of the former locations?




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in split field. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 







> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control number 
> and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function. 
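
A minimal sketch of the pivot step mentioned above, written in Python rather than as a streaming expression. The ticket does not specify the tuple layout, so the keys used here (`company`, `timefield`, `avg(price_f)`) and the sample values are assumptions made purely for illustration; the `pivot` helper is hypothetical and not part of Solr.

```python
def pivot(tuples, row_key, col_key, value_key, fill=0.0):
    """Pivot flat tuples into (row_labels, col_labels, matrix).

    Each distinct row_key value becomes a matrix row and each distinct
    col_key value becomes a column; missing cells are set to `fill`.
    """
    row_labels = sorted({t[row_key] for t in tuples})
    col_labels = sorted({t[col_key] for t in tuples})
    row_index = {label: i for i, label in enumerate(row_labels)}
    col_index = {label: j for j, label in enumerate(col_labels)}
    matrix = [[fill] * len(col_labels) for _ in row_labels]
    for t in tuples:
        matrix[row_index[t[row_key]]][col_index[t[col_key]]] = t[value_key]
    return row_labels, col_labels, matrix


# Hypothetical output tuples: one per (split value, time bucket) pair.
sample = [
    {"company": "acme", "timefield": "2020-10-01", "avg(price_f)": 10.0},
    {"company": "acme", "timefield": "2020-10-02", "avg(price_f)": 12.0},
    {"company": "zeta", "timefield": "2020-10-01", "avg(price_f)": 7.5},
]
rows, cols, matrix = pivot(sample, "company", "timefield", "avg(price_f)")
```

With one row per split value and one column per time bucket, each row is a time line that can be handed to correlation or clustering code.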




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function.  The *diff* function 
already supports the serial differencing of matrix columns so it's very easy to 
perform cluster etc. on this output.






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function. 







> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control number 
> and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff* function 
> already supports the serial differencing of matrix columns so it's very easy 
> to perform cluster etc. on this output.




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function.  The *diff* function 
already supports the serial differencing of matrix columns so it's very easy to 
perform clustering etc. on this output.






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function.  The *diff* function 
already supports the serial differencing of matrix columns so it's very easy to 
perform cluster etc. on this output.







> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control number 
> and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff* function 
> already supports the serial differencing of matrix columns so it's very easy 
> to perform clustering etc. on this output.




[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Aroop (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209051#comment-17209051
 ] 

Aroop commented on SOLR-14916:
--

[~jbernste] this looks very neat!

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control number 
> and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff* function 
> already supports the serial differencing of matrix columns so it's very easy 
> to perform clustering etc. on this output.




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function.  The *diff*  and 
*minMaxScale* functions already support operations over matrix rows so it's 
very easy to perform clustering etc. on this output.






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function.  The *diff* function 
already supports the serial differencing of matrix columns so it's very easy to 
perform clustering etc. on this output.







> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control number 
> and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.
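
To illustrate what row-wise differencing and min-max scaling do to such a matrix before clustering, here is a small Python sketch. These are plain-Python stand-ins for the general technique, not Solr's *diff* or *minMaxScale* implementations, whose exact semantics (lag, default range, handling of constant rows) may differ; the function names and sample matrix are assumptions for illustration only.

```python
def diff_rows(matrix):
    """Serial (lag-1) differencing applied to each row independently."""
    return [[row[i] - row[i - 1] for i in range(1, len(row))] for row in matrix]


def min_max_scale_rows(matrix, lo=0.0, hi=1.0):
    """Rescale each row independently into the range [lo, hi]."""
    scaled = []
    for row in matrix:
        r_min, r_max = min(row), max(row)
        span = r_max - r_min
        if span == 0:
            # A constant row carries no shape information; map it to lo.
            scaled.append([lo] * len(row))
        else:
            scaled.append([lo + (v - r_min) * (hi - lo) / span for v in row])
    return scaled


# Hypothetical matrix: one row per split value, one column per time bucket.
matrix = [
    [10.0, 12.0, 11.0],
    [5.0, 5.0, 9.0],
]
differenced = diff_rows(matrix)
scaled = min_max_scale_rows(matrix)
```

Scaling each row separately keeps a high-priced series from dominating a distance measure, so clustering groups time lines by shape rather than by magnitude.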




[jira] [Created] (LUCENE-9566) TestApproximationSearchEquivalence.testExclusion fails

2020-10-06 Thread Mayya Sharipova (Jira)

Mayya Sharipova created LUCENE-9566:
---

 Summary: TestApproximationSearchEquivalence.testExclusion fails
 Key: LUCENE-9566
 URL: https://issues.apache.org/jira/browse/LUCENE-9566
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 8.x, master (9.0)
Reporter: Mayya Sharipova


The test has been failing since the last changes to the sort optimization on 
the doc comparator were merged. I will mute this test for now.




[jira] [Updated] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14916:
--
Description: 
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control the number 
and order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function.  The *diff*  and 
*minMaxScale* functions already support operations over matrix rows so it's 
very easy to perform clustering etc. on this output.






  was:
Currently the time series function only supports aggregations across the time 
dimension. This ticket will add the *split* parameter which will add a top 
level split by categorical field, to produce time lines per each split. The 
split-limit and split-sort parameters will also be added to control number and 
order of values in the split field result. 

Sample syntax:
{code}
timeseries(collection1, 
   q="*:*", 
   split="company", 
   split-limit=10, 
   split-sort="avg(price_f) desc",  
   field="timefield", 
   gap="+1DAY", 
   format="-dd-MM" ,
   avg(price_f))
{code}

The output of this can be easily pivoted into a matrix and correlated or 
clustered like the output of the *facet2D* function.  The *diff*  and 
*minMaxScale* functions already support operations over matrix rows so it's 
very easy to perform clustering etc. on this output.







> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.




[jira] [Commented] (LUCENE-9540) Investigate double indexing of the fullPathField in the DirectoryTaxonomyWriter

2020-10-06 Thread Michael McCandless (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209076#comment-17209076
 ] 

Michael McCandless commented on LUCENE-9540:


Phew, thanks for bringing closure [~gworah]!

> Investigate double indexing of the fullPathField in the 
> DirectoryTaxonomyWriter
> ---
>
> Key: LUCENE-9540
> URL: https://issues.apache.org/jira/browse/LUCENE-9540
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Gautam Worah
>Priority: Minor
>
> We may have reason to believe that we are double indexing the fullPathField 
> postings item in the DirectoryTaxonomyWriter constructor.
> This should ideally be a StoredField.
> See related discussion in PR https://github.com/apache/lucene-solr/pull/1733/
> Postings are already enabled for facet labels in 
> [FacetsConfig#L364-L399|https://github.com/apache/lucene-solr/blob/master/lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java#L364-L366]
>  




[GitHub] [lucene-solr] dsmiley opened a new pull request #1953: SOLR-14917: Move DOMUtil and PropertiesUtil to SolrJ

2020-10-06 Thread GitBox



dsmiley opened a new pull request #1953:
URL: https://github.com/apache/lucene-solr/pull/1953


   https://issues.apache.org/jira/browse/SOLR-14917
   
   Not worth a CHANGES.txt.
   On 8.x, the classes will be kept as subclasses marked with the Deprecated 
annotation, so there's no code to maintain.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Aroop (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209108#comment-17209108
 ] 

Aroop commented on SOLR-14916:
--

[~jbernste] what values of "gap" will we support, and will "format" have a 
corresponding list of valid values documented, or an enum/constants file 
created to that effect?

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>    q="*:*", 
>    split="company", 
>    split-limit=10, 
>    split-sort="avg(price_f) desc",  
>    field="timefield", 
>    gap="+1DAY", 
>    format="-dd-MM" ,
>    avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.




[GitHub] [lucene-solr] kwatters closed pull request #1810: SOLR-14787 - supporting an operator for the payload check query parser to support greater than / less than payload check queries.

2020-10-06 Thread GitBox



kwatters closed pull request #1810:
URL: https://github.com/apache/lucene-solr/pull/1810


   




[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209128#comment-17209128
 ] 

Joel Bernstein commented on SOLR-14916:
---

The gap parameter maps to the JSON facet range faceting gap parameter. I 
believe this is documented in Solr's time field docs.

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-limit=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.




[jira] [Comment Edited] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209128#comment-17209128
 ] 

Joel Bernstein edited comment on SOLR-14916 at 10/6/20, 8:18 PM:
-

The gap parameter maps to the JSON facet range faceting gap parameter. I 
believe this is documented in Solr's time field docs.


was (Author: joel.bernstein):
The gap parameter maps to the JSON facet range faceting gap parameter. I 
believe this documented in Solr's time field docs.

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-limit=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.




[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209133#comment-17209133
 ] 

Joel Bernstein commented on SOLR-14916:
---

This covers some of the gap information:

https://lucene.apache.org/solr/guide/8_4/working-with-dates.html#date-math-syntax
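
As a rough illustration of how a gap such as "+1DAY" steps through a time range, here is a simplified Java sketch. It is not Solr's date math implementation: it only understands the form +<n><UNIT> for a handful of units, works in UTC, and the names GapBuckets and bucketStarts are made up for this example. The full syntax is described at the page linked above.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GapBuckets {
  // Simplified gap grammar (an assumption for this sketch): "+", a count, a unit.
  private static final Pattern GAP =
      Pattern.compile("\\+(\\d+)(MINUTE|HOUR|DAY|MONTH|YEAR)S?");

  // Bucket start times in [start, end), stepping by the gap, computed in UTC.
  static List<Instant> bucketStarts(Instant start, Instant end, String gap) {
    Matcher m = GAP.matcher(gap);
    if (!m.matches()) {
      throw new IllegalArgumentException("Unsupported gap: " + gap);
    }
    long amount = Long.parseLong(m.group(1));
    if (amount <= 0) {
      throw new IllegalArgumentException("Gap must be positive: " + gap);
    }
    ChronoUnit unit = ChronoUnit.valueOf(m.group(2) + "S"); // e.g. DAY -> DAYS

    ZonedDateTime first = start.atZone(ZoneOffset.UTC);
    ZonedDateTime stop = end.atZone(ZoneOffset.UTC);
    List<Instant> out = new ArrayList<>();
    // Always offset from the first bucket so month-end clamping does not accumulate.
    for (long i = 0; ; i++) {
      ZonedDateTime t = first.plus(amount * i, unit);
      if (!t.isBefore(stop)) {
        break;
      }
      out.add(t.toInstant());
    }
    return out;
  }

  public static void main(String[] args) {
    Instant start = Instant.parse("2020-10-01T00:00:00Z");
    Instant end = Instant.parse("2020-10-04T00:00:00Z");
    for (Instant bucket : bucketStarts(start, end, "+1DAY")) {
      System.out.println(bucket);
    }
  }
}
```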

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-limit=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.




[jira] [Comment Edited] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Joel Bernstein (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209133#comment-17209133
 ] 

Joel Bernstein edited comment on SOLR-14916 at 10/6/20, 8:21 PM:
-

This covers some of the gap information:

https://lucene.apache.org/solr/guide/8_4/working-with-dates.html#date-math-syntax


was (Author: joel.bernstein):
This covers some of gap information:

https://lucene.apache.org/solr/guide/8_4/working-with-dates.html#date-math-syntax

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-limit=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.




[GitHub] [lucene-solr] jpountz commented on a change in pull request #1952: LUCENE-9565 Fix competitive iteration

2020-10-06 Thread GitBox



jpountz commented on a change in pull request #1952:
URL: https://github.com/apache/lucene-solr/pull/1952#discussion_r500574341



##
File path: lucene/core/src/java/org/apache/lucene/search/Weight.java
##
@@ -266,4 +274,45 @@ static void scoreAll(LeafCollector collector, DocIdSetIterator iterator, TwoPhas
 }
   }
 
+  /**
+   * Wraps an internal docIdSetIterator for it to start with docID = -1
+   */
+  protected static class RangeDISIWrapper extends DocIdSetIterator {
+    private final DocIdSetIterator in;
+    private final int min;
+    private final int max;
+    private int docID = -1;
+
+    public RangeDISIWrapper(DocIdSetIterator in, int max) {
+      this.in = in;
+      this.min = in.docID();
+      this.max = max;
+    }
+
+    @Override
+    public int docID() {
+      return docID;
+    }
+
+    @Override
+    public int nextDoc() throws IOException {
+      return advance(docID + 1);
+    }
+
+    @Override
+    public int advance(int target) throws IOException {
+      target = Math.max(min, target);
+      if (target >= max) {
+        return docID = NO_MORE_DOCS;
+      }
+      return docID = in.advance(target);

Review comment:
   It just occurred to me that this implementation is not correct in the 
case that the minimum bound of the range of doc IDs to score is less than the 
current doc ID of the scorer. Have you seen any failures with your change? I 
wonder whether we would need to do
   ```
   if (target >= scorer.docID()) { return scorer.docID(); }
   ```
   but we should create a test that fails without this
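
The concern can be illustrated with a self-contained sketch. SimpleIterator, ArrayIterator and RangeWrapper below are hypothetical stand-ins written for this example, not Lucene classes, and the guard shown is only one possible way to handle an already-positioned iterator; whether the PR should adopt it is left open. The sketch assumes the wrapped iterator must never be asked to advance to a target at or before its current doc.

```java
class RangeWrapperSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  // Minimal stand-in for a doc ID iterator (not Lucene's DocIdSetIterator).
  interface SimpleIterator {
    int docID();

    int advance(int target); // target must be greater than docID()
  }

  // Iterates over a sorted array of doc IDs and rejects backward or repeated advances.
  static final class ArrayIterator implements SimpleIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayIterator(int... docs) {
      this.docs = docs;
    }

    public int docID() {
      if (idx < 0) return -1;
      return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
    }

    public int advance(int target) {
      if (target <= docID()) {
        throw new IllegalStateException("target " + target + " <= current doc " + docID());
      }
      do {
        idx++;
      } while (idx < docs.length && docs[idx] < target);
      return docID();
    }
  }

  // Starts unpositioned at -1 and exposes docs of `in` from its current position up to max.
  static final class RangeWrapper implements SimpleIterator {
    private final SimpleIterator in;
    private final int max;
    private int doc = -1;

    RangeWrapper(SimpleIterator in, int max) {
      this.in = in;
      this.max = max;
    }

    public int docID() {
      return doc;
    }

    public int nextDoc() {
      return advance(doc + 1);
    }

    public int advance(int target) {
      if (target >= max) {
        return doc = NO_MORE_DOCS;
      }
      int cur = in.docID();
      // Only move the wrapped iterator when the target is past its current doc;
      // otherwise its current doc is already the answer.
      if (target > cur) {
        cur = in.advance(target);
      }
      return doc = (cur >= max) ? NO_MORE_DOCS : cur;
    }
  }

  public static void main(String[] args) {
    ArrayIterator in = new ArrayIterator(2, 5, 9, 14);
    in.advance(5); // the wrapped iterator is already positioned on doc 5
    RangeWrapper range = new RangeWrapper(in, 10);
    for (int d = range.nextDoc(); d != NO_MORE_DOCS; d = range.nextDoc()) {
      System.out.println(d);
    }
  }
}
```

A test along these lines, with the wrapped iterator positioned before the wrapper is created, is the kind of case that would fail without a guard.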






[jira] [Commented] (LUCENE-9564) Format code automatically and enforce it

2020-10-06 Thread Adrien Grand (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209146#comment-17209146
 ] 

Adrien Grand commented on LUCENE-9564:
--

There are downsides to code formatting (I'll miss 2-space indentation :)) but 
the pros outweigh the cons, +1. I'm looking forward to imports not being 
reordered when someone using IntelliJ modifies a file after someone using 
Eclipse, and to not having to even look at code style when reviewing pull 
requests. I like how file format javadocs look with this formatter.

> Format code automatically and enforce it
> 
>
> Key: LUCENE-9564
> URL: https://issues.apache.org/jira/browse/LUCENE-9564
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a trivial change but a bold move. And I'm sure it's not for everyone.
> I started using google java format [1] in my projects a while ago and have 
> never looked back since. It is an oracle-style formatter (doesn't allow 
> customizations or deviations from the defined 'ideal') - this takes some 
> getting used to - but it also eliminates *all* the potential differences 
> between IDEs, configs, etc.  And the formatted code typically looks much 
> better than hand-edited one. It is also verifiable on precommit (so you can't 
> commit code that deviates from what you'd get from automated formatting 
> output).
> The biggest benefit I see is that refactorings become such a joy and keep the 
> code neat, everywhere. Before you commit you just reformat everything 
> automatically, no matter how much you messed it up.
> This isn't a change for everyone. I myself love hand-edited, neat code... but 
> the reality is that with IDE support for automated code changes and so many 
> people with different styles working on the same codebase keeping it neat is 
> a big pain. 
> Checkstyle and other tools are fine for ensuring certain rules but they don't 
> take the burden of formatting off your shoulders. This tool does. 
> Like I said - I had *great* reservations about using it at the beginning but 
> over time got so used to it that I almost can't live without it now. It's 
> like magic - you play with the code in any way you like, then run formatting 
> and it's nice and neat.
> The downside is that automated formatting does imply potential merge problems 
> in backward patches (or any currently existing branches).
> Like I said, it is a bold move. Just throwing this for your consideration.
> -I've added a PR that adds spotless but it's not ready; some files would have 
> to be excluded as they currently violate header rules.-
> A more interesting thing is here where the current code is automatically 
> reformatted - this branch is for eyeballing only.
> https://github.com/dweiss/lucene-solr/compare/LUCENE-9564...dweiss:LUCENE-9564-example
> [1] https://google.github.io/styleguide/javaguide.html




[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Aroop (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209147#comment-17209147
 ] 

Aroop commented on SOLR-14916:
--

Thanks Joel

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the time series function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter which will add a top 
> level split by categorical field, to produce time lines per each split. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1, 
>q="*:*", 
>split="company", 
>split-limit=10, 
>split-sort="avg(price_f) desc",  
>field="timefield", 
>gap="+1DAY", 
>format="-dd-MM" ,
>avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or 
> clustered like the output of the *facet2D* function.  The *diff*  and 
> *minMaxScale* functions already support operations over matrix rows so it's 
> very easy to perform clustering etc. on this output.



