[GitHub] [lucene] dweiss commented on pull request #394: LUCENE-9997: write release revision to system temp dir

2021-10-19 Thread GitBox


dweiss commented on pull request #394:
URL: https://github.com/apache/lucene/pull/394#issuecomment-946429811


   Thanks Tomoko. What is this file for though? Is it really needed at all 
(can't it be a variable)? It'd also help to use Python's temporary-file 
facilities so that it's system-agnostic (I don't believe this will even run on 
Windows properly, but we can try not to make it worse).
   
   https://docs.python.org/3/library/tempfile.html
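
A minimal sketch of that suggestion, assuming the script only needs to hand a revision string to a later step. The names `write_revision` and `read_revision` and the `lucene-release-` prefix are hypothetical and not taken from the PR; `tempfile.mkstemp` is the standard-library call that picks the platform's temp directory.

```python
import os
import tempfile


def write_revision(revision: str) -> str:
    """Write the revision to a new file in the system temp dir; return its path."""
    # mkstemp creates the file in tempfile.gettempdir(), so no "/tmp" is
    # hard-coded and the same code works on Linux, macOS and Windows.
    fd, path = tempfile.mkstemp(prefix="lucene-release-", suffix=".rev", text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(revision + "\n")
    return path


def read_revision(path: str) -> str:
    """Read back a revision previously stored by write_revision."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()
```

If the value never has to outlive the process, keeping it in a plain variable avoids the file altogether.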


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-10185) gradle check fails on java 17 (security manager deprecation)

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430362#comment-17430362
 ] 

Dawid Weiss commented on LUCENE-10185:
--

Oh... completely forgot about polymorphic signatures ("Method handle 
compilation", [1])... Damn, this is complex. Thanks for digging deep, Uwe.

[1] 
https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/lang/invoke/MethodHandle.html

> gradle check fails on java 17 (security manager deprecation)
> 
>
> Key: LUCENE-10185
> URL: https://issues.apache.org/jira/browse/LUCENE-10185
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I don't think we should add SuppressWarnings here, instead fix our ECJ linter 
> configuration. Seems like we should be specifying something similar to 
> "-release 11" and it shouldn't care about the new deprecations from java 17. 
> Or if we can't do that, maybe we should disable the "deprecated for removal" 
> check in ECJ entirely?
> {noformat}
> > Task :lucene:core:ecjLintMain
> --
> 1. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/NamedThreadFactory.java (at line 42)
> final SecurityManager s = System.getSecurityManager();
> The type SecurityManager has been deprecated since version 17 and marked for removal
> --
> 2. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/NamedThreadFactory.java (at line 42)
> final SecurityManager s = System.getSecurityManager();
> The method getSecurityManager() from the type System has been deprecated since version 17 and marked for removal
> --
> 3. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/NamedThreadFactory.java (at line 43)
> group = (s != null) ? s.getThreadGroup() : Thread.currentThread().getThreadGroup();
> The method getThreadGroup() from the type SecurityManager has been deprecated and marked for removal
> --
> --
> 4. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java (at line 23)
> import java.security.AccessControlException;
> The type AccessControlException has been deprecated since version 17 and marked for removal
> --
> 5. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java (at line 24)
> import java.security.AccessController;
> The type AccessController has been deprecated since version 17 and marked for removal
> --
> 6. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java (at line 574)
> AccessController.doPrivileged((PrivilegedAction) target::getDeclaredFields);
> The type AccessController has been deprecated since version 17 and marked for removal
> --
> 7. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java (at line 574)
> AccessController.doPrivileged((PrivilegedAction) target::getDeclaredFields);
> The method doPrivileged(PrivilegedAction) from the type AccessController has been deprecated and marked for removal
> --
> 8. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/util/RamUsageEstimator.java (at line 575)
> } catch (AccessControlException e) {
> The type AccessControlException has been deprecated since version 17 and marked for removal
> --
> --
> 9. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java (at line 33)
> import java.security.AccessController;
> The type AccessController has been deprecated since version 17 and marked for removal
> --
> 10. ERROR in /home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java (at line 337)
> AccessController.doPrivileged((PrivilegedAction) MMapDirectory::unmapHackImpl);
> The type AccessController has been deprecated since version 17 and marked for removal
> --
>

[GitHub] [lucene] dweiss commented on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


dweiss commented on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946437537


   > Or the script could run the main build in parallel and then run just the 
signing serially.
   
   Sure. Split in two - I also suggested removing "clean" because just 
rebuilding from scratch should always yield correct task outputs (this isn't 
ant). So this sequence:
   ```
   gradlew assembleRelease
   gradlew assembleRelease -Psign --max-workers 1
   ```
   will rerun some tasks but will sign in a single worker.
   
   We could also order all signing tasks within gradle code (so that they can't 
run in parallel, no matter what) but it seems like an unnecessary complexity 
given the infrequent use of the script. I'd rather do the above (or fall back 
to just specifying --max-workers X, where X is small-ish for gpg signing).
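
The two-step sequence above could be driven from a Python script roughly as follows. This is a sketch, not the code in `buildAndPushRelease.py`: the function names and the `gradlew` and `sign_workers` parameters are made up for illustration, while the Gradle arguments are the ones quoted above.

```python
import subprocess
from typing import List


def release_commands(gradlew: str = "./gradlew", sign_workers: int = 1) -> List[List[str]]:
    """Return the two Gradle invocations: a parallel build, then limited signing."""
    if sign_workers < 1:
        raise ValueError("sign_workers must be at least 1")
    return [
        # First pass: build with Gradle's default parallelism, no signing.
        [gradlew, "assembleRelease"],
        # Second pass: may rerun some tasks, but signing is limited to
        # the given number of workers.
        [gradlew, "assembleRelease", "-Psign", "--max-workers", str(sign_workers)],
    ]


def run_release(gradlew: str = "./gradlew", sign_workers: int = 1) -> None:
    """Run both passes in order, stopping at the first failure."""
    for command in release_commands(gradlew, sign_workers):
        subprocess.run(command, check=True)
```

Keeping the command construction separate from `subprocess.run` makes the sequence easy to inspect or log before anything is executed.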





[GitHub] [lucene] dweiss edited a comment on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


dweiss edited a comment on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946437537


   > Or the script could run the main build in parallel and then run just the 
signing serially.
   
   Sure. Split in two - I also suggested removing "clean" because rebuilding 
from any state should always yield correct task outputs (this isn't ant where 
you have to clean leftovers over and over). So this sequence:
   ```
   gradlew assembleRelease
   gradlew assembleRelease -Psign --max-workers 1
   ```
   will rerun some tasks but will sign in a single worker.
   
   We could also order all signing tasks within gradle code (so that they can't 
run in parallel, no matter what) but it seems like an unnecessary complexity 
given the infrequent use of the script. I'd rather do the above (or fall back 
to just specifying --max-workers X, where X is small-ish for gpg signing).





[jira] [Commented] (LUCENE-10166) Move relevant content of README.txt files from subprojects into package javadocs

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430375#comment-17430375
 ] 

ASF subversion and git services commented on LUCENE-10166:
--

Commit e290f91bb233f33cde4b2249d676298d5740e8b1 in lucene's branch 
refs/heads/main from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=e290f91 ]

LUCENE-10166: removed module-level README.txt and modified a few links, removed 
a few obsolete instructions from 20 years ago. (#379)



> Move relevant content of README.txt files from subprojects into package 
> javadocs
> 
>
> Key: LUCENE-10166
> URL: https://issues.apache.org/jira/browse/LUCENE-10166
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Created] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread Jira
Jan Høydahl created LUCENE-10186:


 Summary: Maven artifacts built by gradle lack required META-INF 
content
 Key: LUCENE-10186
 URL: https://issues.apache.org/jira/browse/LUCENE-10186
 Project: Lucene - Core
  Issue Type: Bug
  Components: general/build
Reporter: Jan Høydahl


Spinoff from LUCENE-9997

Turns out that the maven artifacts generated by gradle lack LICENSE and NOTICE 
files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
{code:java}
RuntimeError: JAR file 
"/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
 is missing META-INF/NOTICE.txt {code}
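
A check of this kind can be sketched with Python's `zipfile` module. This is an illustration only, not the smoke tester's actual code: the function name `missing_meta_inf` is hypothetical, and `META-INF/LICENSE.txt` is an assumed file name chosen by analogy with the `META-INF/NOTICE.txt` in the error above.

```python
import zipfile
from typing import List

# NOTICE.txt appears in the error message; LICENSE.txt is an assumption.
REQUIRED_ENTRIES = ("META-INF/LICENSE.txt", "META-INF/NOTICE.txt")


def missing_meta_inf(jar_path: str) -> List[str]:
    """Return the required META-INF entries that are absent from the JAR."""
    # A JAR is a ZIP archive, so its entry names can be listed directly.
    with zipfile.ZipFile(jar_path) as jar:
        present = set(jar.namelist())
    return [name for name in REQUIRED_ENTRIES if name not in present]
```

An empty result means both entries are present; anything else names the files a sources or javadoc JAR still lacks.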






[GitHub] [lucene] dweiss merged pull request #379: LUCENE-10166: removed module-level README.txt and modified a few links.

2021-10-19 Thread GitBox


dweiss merged pull request #379:
URL: https://github.com/apache/lucene/pull/379


   





[jira] [Commented] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430376#comment-17430376
 ] 

Dawid Weiss commented on LUCENE-10186:
--

I'll handle this today, Jan.

> Maven artifacts built by gradle lack required META-INF content
> --
>
> Key: LUCENE-10186
> URL: https://issues.apache.org/jira/browse/LUCENE-10186
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Jan Høydahl
>Priority: Major
>
> Spinoff from LUCENE-9997
> Turns out that the maven artifacts generated by gradle lack LICENSE and 
> NOTICE files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
> {code:java}
> RuntimeError: JAR file 
> "/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
>  is missing META-INF/NOTICE.txt {code}






[jira] [Resolved] (LUCENE-10166) Move relevant content of README.txt files from subprojects into package javadocs

2021-10-19 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-10166.
--
Fix Version/s: main (9.0)
 Assignee: Dawid Weiss
   Resolution: Fixed

> Move relevant content of README.txt files from subprojects into package 
> javadocs
> 
>
> Key: LUCENE-10166
> URL: https://issues.apache.org/jira/browse/LUCENE-10166
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: main (9.0)
>
>







[jira] [Assigned] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss reassigned LUCENE-10186:


Assignee: Dawid Weiss

> Maven artifacts built by gradle lack required META-INF content
> --
>
> Key: LUCENE-10186
> URL: https://issues.apache.org/jira/browse/LUCENE-10186
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>
> Spinoff from LUCENE-9997
> Turns out that the maven artifacts generated by gradle lack LICENSE and 
> NOTICE files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
> {code:java}
> RuntimeError: JAR file 
> "/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
>  is missing META-INF/NOTICE.txt {code}






[jira] [Commented] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430377#comment-17430377
 ] 

Dawid Weiss commented on LUCENE-10186:
--

Do source packages need a manifest though? Seems odd to me. These are not 
really binary JARs - they're a convenience for IDEs?

> Maven artifacts built by gradle lack required META-INF content
> --
>
> Key: LUCENE-10186
> URL: https://issues.apache.org/jira/browse/LUCENE-10186
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>
> Spinoff from LUCENE-9997
> Turns out that the maven artifacts generated by gradle lack LICENSE and 
> NOTICE files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
> {code:java}
> RuntimeError: JAR file 
> "/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
>  is missing META-INF/NOTICE.txt {code}






[jira] [Commented] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430378#comment-17430378
 ] 

Dawid Weiss commented on LUCENE-10186:
--

We actually explicitly skip those JARs from receiving the manifest:
{code}
// Apply the manifest to any JAR or WAR file created by any project,
// excluding those explicitly listed.
tasks.withType(Jar)
  .matching { t -> !["sourcesJar", "javadocJar"].contains(t.name) }
{code}

> Maven artifacts built by gradle lack required META-INF content
> --
>
> Key: LUCENE-10186
> URL: https://issues.apache.org/jira/browse/LUCENE-10186
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>
> Spinoff from LUCENE-9997
> Turns out that the maven artifacts generated by gradle lack LICENSE and 
> NOTICE files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
> {code:java}
> RuntimeError: JAR file 
> "/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
>  is missing META-INF/NOTICE.txt {code}






[GitHub] [lucene] janhoy commented on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


janhoy commented on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946453138


   @mocobeta I got the same result as you: it looks like LICENSE and NOTICE are 
not copied into the maven jars. Strange, since they exist in the binary-release 
jars. It looks like the maven task rebuilds the jars. Couldn't the maven task 
use the pre-built jars that already have NOTICE and LICENSE?
   
   I also noticed that `MANIFEST.MF` is effectively empty in the maven jars, 
containing just one line, which is also wrong:
   ```
   Manifest-Version: 1.0
   ```
   I created https://issues.apache.org/jira/browse/LUCENE-10186 for maven 
artifacts.
   Re-opened LUCENE-10174 for the buildAndPushRelease changes.
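
To tell such a bare manifest apart from a populated one, the main section can be parsed and its attributes counted. The sketch below is only an illustration: the helper names are hypothetical, and it relies on just two JAR manifest rules, that a continuation line starts with a single space and that a blank line ends the main section.

```python
from typing import Dict


def parse_main_section(manifest_text: str) -> Dict[str, str]:
    """Parse the main section of a MANIFEST.MF into a name -> value mapping."""
    attributes: Dict[str, str] = {}
    last_name = None
    for line in manifest_text.splitlines():
        if not line.strip():
            if attributes:
                break  # a blank line ends the main section
            continue
        if line.startswith(" ") and last_name is not None:
            # Continuation line: append everything after the leading space.
            attributes[last_name] += line[1:]
            continue
        name, _, value = line.partition(":")
        last_name = name.strip()
        attributes[last_name] = value.strip()
    return attributes


def is_bare_manifest(manifest_text: str) -> bool:
    """True when Manifest-Version is the only main-section attribute."""
    return set(parse_main_section(manifest_text)) == {"Manifest-Version"}
```

For the one-line manifest quoted above, `is_bare_manifest` returns `True`.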





[jira] [Reopened] (LUCENE-10174) Update buildAndPushRelease.py for new gradle build

2021-10-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reopened LUCENE-10174:
--

Re-opening to enhance the 'assembleRelease' step, to avoid OOM in gpg-agent. 
Will split in two commands and skip 'clean':
{code:java}
gradlew assembleRelease ...
gradlew assembleRelease -Psign --max-workers 1 ... {code}

> Update buildAndPushRelease.py for new gradle build
> --
>
> Key: LUCENE-10174
> URL: https://issues.apache.org/jira/browse/LUCENE-10174
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With LUCENE-9488 and LUCENE-10173 the gradle build was polished to properly 
> build source and binary artifacts, and sign those using either gpg tool or a 
> built-in java-based signing plugin. See 
> [https://github.com/apache/lucene/blob/main/help/publishing.txt]
> This jira will update {{buildAndPushRelease.py}} script to use the correct 
> build parameters. It will also add cmdline args to choose between gpg and 
> built-in (gpg default), and to supply the location of {{gpgHome}} if you do 
> not use gpg. We'll also add an option to NOT prompt for passphrase in the 
> python script, which will fallback to defaults (gpg-agent, env.vars or 
> gradle.properties).






[jira] [Commented] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430383#comment-17430383
 ] 

Dawid Weiss commented on LUCENE-10186:
--

I filed a PR. I think this was intentional (by me) not to include manifests in 
these files - I didn't see the point.

> Maven artifacts built by gradle lack required META-INF content
> --
>
> Key: LUCENE-10186
> URL: https://issues.apache.org/jira/browse/LUCENE-10186
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>
> Spinoff from LUCENE-9997
> Turns out that the maven artifacts generated by gradle lack LICENSE and 
> NOTICE files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
> {code:java}
> RuntimeError: JAR file 
> "/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
>  is missing META-INF/NOTICE.txt {code}






[jira] [Commented] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430386#comment-17430386
 ] 

ASF subversion and git services commented on LUCENE-10186:
--

Commit 6c21862a552cccbb8509e4383ac8c6d10c68137f in lucene's branch 
refs/heads/main from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=6c21862 ]

LUCENE-10186: Include manifest and legalese in source and javadoc jars. (#395)



> Maven artifacts built by gradle lack required META-INF content
> --
>
> Key: LUCENE-10186
> URL: https://issues.apache.org/jira/browse/LUCENE-10186
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
>
> Spinoff from LUCENE-9997
> Turns out that the maven artifacts generated by gradle lack LICENSE and 
> NOTICE files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
> {code:java}
> RuntimeError: JAR file 
> "/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
>  is missing META-INF/NOTICE.txt {code}






[GitHub] [lucene] dweiss merged pull request #395: LUCENE-10186: Include manifest and legalese in source and javadoc jars.

2021-10-19 Thread GitBox


dweiss merged pull request #395:
URL: https://github.com/apache/lucene/pull/395


   





[jira] [Resolved] (LUCENE-10186) Maven artifacts built by gradle lack required META-INF content

2021-10-19 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-10186.
--
Fix Version/s: main (9.0)
   Resolution: Fixed

> Maven artifacts built by gradle lack required META-INF content
> --
>
> Key: LUCENE-10186
> URL: https://issues.apache.org/jira/browse/LUCENE-10186
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Jan Høydahl
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Spinoff from LUCENE-9997
> Turns out that the maven artifacts generated by gradle lack LICENSE and 
> NOTICE files in META-INF, and also have empty MANIFEST.MF. Smoketester error:
> {code:java}
> RuntimeError: JAR file 
> "/tmp/smoke_lucene_9.0.0_018642ff84f88a2438b32d6aca5d5d35f453e1fb_2/maven/org/apache/lucene/lucene-analysis-smartcn/9.0.0/lucene-analysis-smartcn-9.0.0-sources.jar"
>  is missing META-INF/NOTICE.txt {code}






[jira] [Commented] (LUCENE-10174) Update buildAndPushRelease.py for new gradle build

2021-10-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430393#comment-17430393
 ] 

Jan Høydahl commented on LUCENE-10174:
--

When we skip the "clean" step, the task does not need to re-compile, only to 
assemble the tar/zip and sign. I tested locally and {{--max-workers 8}} (my 
default) results in OOM, but {{--max-workers 4}} works fine. So I'll settle on 
{{--max-workers 2}}, which on my laptop takes less than a minute and should 
be acceptable for a release. The important thing to parallelize is the tests.

> Update buildAndPushRelease.py for new gradle build
> --
>
> Key: LUCENE-10174
> URL: https://issues.apache.org/jira/browse/LUCENE-10174
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With LUCENE-9488 and LUCENE-10173 the gradle build was polished to properly 
> build source and binary artifacts, and sign those using either gpg tool or a 
> built-in java-based signing plugin. See 
> [https://github.com/apache/lucene/blob/main/help/publishing.txt]
> This jira will update {{buildAndPushRelease.py}} script to use the correct 
> build parameters. It will also add cmdline args to choose between gpg and 
> built-in (gpg default), and to supply the location of {{gpgHome}} if you do 
> not use gpg. We'll also add an option to NOT prompt for passphrase in the 
> python script, which will fallback to defaults (gpg-agent, env.vars or 
> gradle.properties).






[GitHub] [lucene] janhoy opened a new pull request #396: LUCENE-10174 BuildAndPushRelease additional improvements

2021-10-19 Thread GitBox


janhoy opened a new pull request #396:
URL: https://github.com/apache/lucene/pull/396


   https://issues.apache.org/jira/browse/LUCENE-10174
   
   Makes assembleRelease OOM safe with max-workers=2 and faster by avoiding 
'clean'





[jira] [Comment Edited] (LUCENE-10174) Update buildAndPushRelease.py for new gradle build

2021-10-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430393#comment-17430393
 ] 

Jan Høydahl edited comment on LUCENE-10174 at 10/19/21, 8:23 AM:
-

When we skip the "clean" step, the task does not need to re-compile, only to 
assemble the tar/zip and sign. I tested locally and {{--max-workers 8}} (my 
default) results in OOM, but {{--max-workers 4}} works fine. So I'll settle on 
{{--max-workers 2}}, which on my laptop takes less than a minute and should 
be acceptable for a release. The important thing to parallelize is the tests.


was (Author: janhoy):
When we skip the "clean" step, the task does not need to re-compile, only to 
assemble the tar/zip and sign. I tested locally and {{--max-workers 8}} (my 
default) results in OOM, but {{--max-workers 4}} works fine. So I'll settle on 
{{--max-workers 2}}, which on my laptop takes less than a minute and should 
be acceptable for a release. The important thing to parallelize is the tests.

> Update buildAndPushRelease.py for new gradle build
> --
>
> Key: LUCENE-10174
> URL: https://issues.apache.org/jira/browse/LUCENE-10174
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> With LUCENE-9488 and LUCENE-10173 the gradle build was polished to properly 
> build source and binary artifacts, and sign those using either gpg tool or a 
> built-in java-based signing plugin. See 
> [https://github.com/apache/lucene/blob/main/help/publishing.txt]
> This jira will update {{buildAndPushRelease.py}} script to use the correct 
> build parameters. It will also add cmdline args to choose between gpg and 
> built-in (gpg default), and to supply the location of {{gpgHome}} if you do 
> not use gpg. We'll also add an option to NOT prompt for passphrase in the 
> python script, which will fallback to defaults (gpg-agent, env.vars or 
> gradle.properties).






[jira] [Commented] (LUCENE-10174) Update buildAndPushRelease.py for new gradle build

2021-10-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430396#comment-17430396
 ] 

Jan Høydahl commented on LUCENE-10174:
--

See PR https://github.com/apache/lucene/pull/396

> Update buildAndPushRelease.py for new gradle build
> --
>
> Key: LUCENE-10174
> URL: https://issues.apache.org/jira/browse/LUCENE-10174
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> With LUCENE-9488 and LUCENE-10173 the gradle build was polished to properly 
> build source and binary artifacts, and sign those using either gpg tool or a 
> built-in java-based signing plugin. See 
> [https://github.com/apache/lucene/blob/main/help/publishing.txt]
> This jira will update {{buildAndPushRelease.py}} script to use the correct 
> build parameters. It will also add cmdline args to choose between gpg and 
> built-in (gpg default), and to supply the location of {{gpgHome}} if you do 
> not use gpg. We'll also add an option to NOT prompt for passphrase in the 
> python script, which will fallback to defaults (gpg-agent, env.vars or 
> gradle.properties).






[GitHub] [lucene] janhoy edited a comment on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


janhoy edited a comment on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946453138


   @mocobeta I got the same as you - it looks like LICENSE and NOTICE are not 
copied into the maven jars. Strange, since they exist in the binary-release 
jars. It looks like the maven task rebuilds the jars. Couldn't the maven task 
use the pre-built jars that already have NOTICE and LICENSE?
   
   I also noticed that `MANIFEST.MF` is nearly empty in the maven jars, just 
one line, which is also wrong:
   ```
   Manifest-Version: 1.0
   ```
   I created https://issues.apache.org/jira/browse/LUCENE-10186 for maven 
artifacts.
   Re-opened LUCENE-10174 for the buildAndPushRelease changes, 
https://github.com/apache/lucene/pull/396.





[GitHub] [lucene] janhoy commented on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


janhoy commented on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946481179


   @mocobeta I think you asked whether the `./gradlew` commands are 
Windows-safe. I really don't know. I see tons of `/` in that script, so perhaps 
Python translates them? I don't use Windows, so I cannot test.





[jira] [Resolved] (LUCENE-10182) TestRamUsageEstimator asserts trivial equality

2021-10-19 Thread Uwe Schindler (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-10182.

Resolution: Fixed

> TestRamUsageEstimator asserts trivial equality
> --
>
> Key: LUCENE-10182
> URL: https://issues.apache.org/jira/browse/LUCENE-10182
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Stefan Vodita
>Assignee: Uwe Schindler
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{TestRamUsageEstimator.testStaticOverloads}} has several lines like:
> {code:java}
> assertEquals(sizeOf(array), sizeOf((Object) array));
> {code}
> Both calls to {{sizeOf()}} fall back on {{RamUsageTester.sizeOf}}, making the 
> 2 calls identical. Instead, we would want one of the calls to go to 
> {{RamUsageEstimator.sizeOf}}.
>  
> This issue came up while working on LUCENE-10129. A possible solution, as per 
> [~uschindler]'s suggestion, would be to remove the static import
> {code:java}
> import static org.apache.lucene.util.RamUsageTester.sizeOf;
> {code}
> Instead, we could be explicit on which method we are calling, like:
> {code:java}
> assertEquals(RamUsageEstimator.sizeOf(array), RamUsageTester.sizeOf(array));
> {code}
> This could be replicated for other potentially confusing cases in the test 
> class.






[jira] [Commented] (LUCENE-10182) TestRamUsageEstimator asserts trivial equality

2021-10-19 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430408#comment-17430408
 ] 

Uwe Schindler commented on LUCENE-10182:


Hi, I think we should not backport the changes. The last Lucene/Solr 8.11 
release is already under way, so it is not worth the trouble. The code may be 
untested, but that does not mean there's a bug in production code.

One thing I noticed when reading through the patch again: in 
TestRamUsageEstimator, the argument order of assertEquals is wrong. The 
expected value (what RamUsageTester returns) should come first, and the value 
we want to verify (the RamUsageEstimator static overload) should be the second 
parameter. But that's just nitpicking. If you want to fix it, make a PR.

So I would close this issue now.







[jira] [Commented] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430415#comment-17430415
 ] 

Dawid Weiss commented on LUCENE-9997:
-

I ran the full check with local keys and dev mode (on 
jan/lucene9997-smoketester-part-2). 
{code}
SUCCESS! [0:07:30.308004]
{code}
Two things that are odd:
- r-- permissions on all maven artifact files - these are slightly odd and 
prevent those files from being removed (from /tmp).
- that 'rev.txt' file is annoying. I'm not sure what it's for and don't have 
the time to check, but it looks like a bug.


> Revisit smoketester for 9.0 build
> -
>
> Key: LUCENE-9997
> URL: https://issues.apache.org/jira/browse/LUCENE-9997
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Robert Muir
>Priority: Major
> Attachments: image-2021-10-12-12-47-11-480.png, 
> image-2021-10-12-12-48-15-373.png
>
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Currently we have a (great) {{dev-tools/scripts/smokeTester.py}} that will 
> perform automated tests against a release.
> This was developed with the ant build process in mind.
> This issue is just about considering the automated checks we do here, maybe 
> some of them can be done efficiently in the gradle build in earlier places: 
> this would be a large improvement!
> Obviously some of them (e.g. GPG release key verifications) are really 
> specific to the artifacts in question. These are most important to release 
> verification, as that is actually the only place we can check it.
> Any other checks (and I do tend to think, this checker should try to be 
> thorough, invoking gradle etc), should be stuff we regularly test in 
> PRs/nightly/builds.






[GitHub] [lucene] dweiss commented on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


dweiss commented on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946494828


   I use Windows - these scripts are not compatible. I don't think we have to 
make it a priority to make them compatible. Too many variables to think of.





[GitHub] [lucene] stefanvodita opened a new pull request #397: LUCENE-10182: Order assertion parameters correctly

2021-10-19 Thread GitBox


stefanvodita opened a new pull request #397:
URL: https://github.com/apache/lucene/pull/397


   # Description
   
   Reorder the assert parameters in `TestRamUsageEstimator.testStaticOverloads` 
to follow `assertEquals(expected, actual)` instead of 
`assertEquals(actual, expected)`.
   
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/lucene/HowToContribute) and my code 
conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Lucene maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   





[jira] [Commented] (LUCENE-10182) TestRamUsageEstimator asserts trivial equality

2021-10-19 Thread Stefan Vodita (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430430#comment-17430430
 ] 

Stefan Vodita commented on LUCENE-10182:


Might as well fix the assertion order, since it's a small change. 
[Here|https://github.com/apache/lucene/pull/397] is the PR for it.







[GitHub] [lucene] mocobeta commented on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


mocobeta commented on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946544474


   @janhoy @dweiss 
   
   Commands with `./` do not work on Windows. I tested this on Windows 
yesterday; neither Command Prompt nor PowerShell supports `./gradlew`.
   
   After I noticed the comment in the file linked below, I deleted my earlier 
comment... sorry for the noise.
   
https://github.com/apache/lucene/blob/6c21862a552cccbb8509e4383ac8c6d10c68137f/dev-tools/scripts/smokeTestRelease.py#L43-L45
   
   I think it could take a lot of work to make the scripts OS-agnostic. As for 
Windows, instead of fully supporting it, perhaps we could test the scripts on 
WSL2 and then drop Cygwin? I have little experience with WSL2, but it seems to 
work just like plain Ubuntu and it's easier to install than Cygwin (its I/O 
performance was terrible a few years ago, but it should have improved...).
   





[jira] [Commented] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430442#comment-17430442
 ] 

Dawid Weiss commented on LUCENE-9997:
-

The rev.txt file is for running on a "prepared" package - it saves the git 
revision in a separate file because otherwise there would be no place to read 
it back from.
{code}
  parser.add_argument('--no-prepare', dest='prepare', default=True, action='store_false',
                      help='Use the already built release in the provided checkout')

I think we should make the git revision part of the distribution artifacts - 
then the smoke tester can read it directly from the distribution artifact 
release folder. Moreover, the git revision could also be part of the "source" 
distribution of Lucene - then the build scripts can be tweaked to actually work 
without the git clone (on the true "source" distribution) by simulating the git 
revision read from such a file.

Thoughts?
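
A minimal sketch of that idea in Python, assuming the revision is stored as a single line in a text file inside the release folder. The file name `GIT_REVISION.txt` and both helper functions are hypothetical; nothing in this thread names the file or its format.

```python
import os
import tempfile

# Hypothetical file name, chosen only for this sketch.
REVISION_FILE = "GIT_REVISION.txt"


def write_revision(release_dir, revision):
    """Store the git revision next to the release artifacts."""
    path = os.path.join(release_dir, REVISION_FILE)
    with open(path, "w", encoding="utf-8") as f:
        f.write(revision + "\n")


def read_revision(release_dir):
    """Return the stored revision, or None if the file is absent."""
    path = os.path.join(release_dir, REVISION_FILE)
    if not os.path.isfile(path):
        return None
    with open(path, encoding="utf-8") as f:
        return f.read().strip()


if __name__ == "__main__":
    # tempfile keeps the demo system-agnostic instead of hard-coding /tmp.
    with tempfile.TemporaryDirectory() as release_dir:
        write_revision(release_dir, "8b68bf60c9871ecb200f64c64bf55eb6ac456c0e")
        print(read_revision(release_dir))
```

A consumer such as the smoke tester could then call `read_revision` on the release folder and fall back to asking git only when it returns `None`.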







[GitHub] [lucene] mocobeta edited a comment on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


mocobeta edited a comment on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946544474


   @janhoy @dweiss 
   
   Commands with `./` do not work on Windows. I tested this on Windows 
yesterday; neither Command Prompt nor PowerShell supports `./gradlew`, and 
Python did not translate the commands for Windows.
   
   After I noticed the comment in the file linked below, I deleted my earlier 
comment... sorry for the noise.
   
https://github.com/apache/lucene/blob/6c21862a552cccbb8509e4383ac8c6d10c68137f/dev-tools/scripts/smokeTestRelease.py#L43-L45
   
   I think it could take a lot of work to make the scripts OS-agnostic. As for 
Windows, instead of fully supporting it, perhaps we could test the scripts on 
WSL2 and then drop Cygwin? I have little experience with WSL2, but it seems to 
work just like plain Ubuntu and it's easier to install than Cygwin (its I/O 
performance was terrible a few years ago, but it should have improved...).
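
For reference, one way a Python script could pick the wrapper per platform is sketched below. It assumes the checkout contains both `gradlew` and `gradlew.bat`, as the Gradle wrapper normally provides; the helper `gradle_command` is hypothetical and is not part of the smoke tester. It covers only the wrapper invocation, not the other portability issues raised in this thread.

```python
import os
import platform


def gradle_command(checkout_dir, *tasks, system=None):
    """Return an argument list that invokes the Gradle wrapper in checkout_dir.

    `system` defaults to platform.system() and exists so the choice can be
    exercised on any machine.
    """
    system = system or platform.system()
    # The wrapper comes as a POSIX shell script and as a Windows batch file.
    wrapper = "gradlew.bat" if system == "Windows" else "gradlew"
    return [os.path.join(checkout_dir, wrapper), *tasks]
```

The returned list can be passed to `subprocess.run` as is, which avoids building a shell string that starts with `./`.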
   





[GitHub] [lucene] dweiss commented on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


dweiss commented on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-946571894


   I'm really fine with these scripts working just on Unix-ish systems. If you 
really want to, WSL or a virtual machine is a fine workaround for Windows 
users (like you or me). Like I said - too many variables to consider (file 
permissions are notoriously annoying to get right).





[GitHub] [lucene] apanimesh061 commented on a change in pull request #362: LUCENE-9431: UnifiedHighlighter WEIGHT_MATCHES is now true by default

2021-10-19 Thread GitBox


apanimesh061 commented on a change in pull request #362:
URL: https://github.com/apache/lucene/pull/362#discussion_r731752215



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
##
@@ -1168,9 +1174,12 @@ public CacheHelper getReaderCacheHelper() {
 
 /**
  * Internally use the {@link Weight#matches(LeafReaderContext, int)} API for highlighting. It's
- * more accurate to the query, though might not calculate passage relevancy as well. Use of this
- * flag requires {@link #MULTI_TERM_QUERY} and {@link #PHRASES}. {@link
- * #PASSAGE_RELEVANCY_OVER_SPEED} will be ignored. False by default.
+ * more accurate to the query, and the snippets can be a little different for phrases because
+ * the whole phrase is marked up instead of each word. The passage relevancy calculation can be
+ * different (maybe worse?) and it's slower when highlighting many fields. Use of this flag
+ * requires {@link #MULTI_TERM_QUERY} and {@link #PHRASES}. {@link
+ * #PASSAGE_RELEVANCY_OVER_SPEED} will be ignored. True by default, so long as the requirements

Review comment:
   I am attaching a diff file here which contains the unit test changes for 
the default behavior and the changes I mentioned in the comment above: 
[LUCENE-9431.txt](https://github.com/apache/lucene/files/7372657/LUCENE-9431.txt).
 Meanwhile I am trying to figure out how to update this current pull-request.
   







[GitHub] [lucene] dnhatn commented on pull request #389: LUCENE-10159: Fix invalid access in sorted set dv

2021-10-19 Thread GitBox


dnhatn commented on pull request #389:
URL: https://github.com/apache/lucene/pull/389#issuecomment-946649980


   @rmuir @jpountz Thanks for reviewing.





[jira] [Commented] (LUCENE-9613) Create blocks for ords when it helps in Lucene80DocValuesFormat

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430493#comment-17430493
 ] 

ASF subversion and git services commented on LUCENE-9613:
-

Commit 8b68bf60c9871ecb200f64c64bf55eb6ac456c0e in lucene's branch 
refs/heads/main from Nhat Nguyen
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=8b68bf6 ]

LUCENE-10159: Fix invalid access in sorted set dv (#389)

We introduced invalid accesses for sorted set doc values in LUCENE-9613. 
However, the issue has been unnoticed because the ordinals in doc values
tests aren't complex enough to use high packed bits, and the 3 padding
bytes make these invalid accesses perfectly fine. To reproduce this
issue, we need to use at least 20 bits per value for the ordinals.
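
To illustrate why padding can hide out-of-bounds reads over packed values, here is a small Python sketch. It is not Lucene's DirectReader and does not model the actual bug; the MSB-first layout and the functions `pack` and `get` are assumptions made for the example. It only shows that decoding 20-bit values with 4-byte reads touches bytes past the last packed byte unless padding follows the data.

```python
import struct

BITS = 20
PADDING_BYTES = 3  # trailing bytes so a 4-byte read at the last value stays in bounds


def pack(values, padding=PADDING_BYTES):
    """Pack values MSB-first using BITS bits each, then append padding bytes."""
    acc = 0
    for v in values:
        if not 0 <= v < (1 << BITS):
            raise ValueError("value does not fit in %d bits" % BITS)
        acc = (acc << BITS) | v
    total_bits = BITS * len(values)
    num_bytes = (total_bits + 7) // 8
    # Left-align the bit stream to a whole number of bytes.
    acc <<= num_bytes * 8 - total_bits
    return acc.to_bytes(num_bytes, "big") + bytes(padding)


def get(data, index):
    """Decode value `index` by reading one 4-byte big-endian word."""
    byte_off, bit_off = divmod(BITS * index, 8)
    # struct.unpack_from raises struct.error if fewer than 4 bytes remain.
    (word,) = struct.unpack_from(">I", data, byte_off)
    return (word >> (32 - BITS - bit_off)) & ((1 << BITS) - 1)
```

With the default padding every index decodes correctly; with `padding=0` the read for the last value raises `struct.error`, which is the Python counterpart of the `IndexOutOfBoundsException` in the stack trace.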

> Create blocks for ords when it helps in Lucene80DocValuesFormat
> ---
>
> Key: LUCENE-9613
> URL: https://issues.apache.org/jira/browse/LUCENE-9613
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: main (9.0)
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently for sorted(-set) values, we always write ords using 
> log2(valueCount) bits per entry. However, in several cases, such as when the 
> field is used in the index sort or when one value is _very_ common, splitting 
> into blocks like we do for numerics would help.
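
A rough way to see the potential saving, sketched in Python under simplifying assumptions: each block stores `ord - block_min` with its own bit width, and per-block header overhead is ignored. The function names and the block size of 128 are hypothetical and do not describe Lucene's on-disk format.

```python
def bits_required(max_value):
    """Number of bits needed to store any integer in [0, max_value]."""
    return max_value.bit_length()


def flat_cost_bits(ords, value_count):
    """Total bits when every ord uses one width derived from value_count."""
    return len(ords) * bits_required(value_count - 1)


def blocked_cost_bits(ords, block_size):
    """Total bits when each block encodes (ord - block_min) with its own width."""
    total = 0
    for start in range(0, len(ords), block_size):
        block = ords[start:start + block_size]
        total += len(block) * bits_required(max(block) - min(block))
    return total


# Index sorted on the field: ords are non-decreasing, four docs per value.
ords = [i // 4 for i in range(4096)]
flat = flat_cost_bits(ords, value_count=1024)
blocked = blocked_cost_bits(ords, block_size=128)
```

In this sorted example each block spans only 32 distinct ords, so it needs 5 bits per entry instead of 10, halving the payload (20480 bits against 40960) before any header overhead is counted.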






[jira] [Commented] (LUCENE-10159) Index corruption: IndexOutOfBoundsException for doc values

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430492#comment-17430492
 ] 

ASF subversion and git services commented on LUCENE-10159:
--

Commit 8b68bf60c9871ecb200f64c64bf55eb6ac456c0e in lucene's branch 
refs/heads/main from Nhat Nguyen
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=8b68bf6 ]

LUCENE-10159: Fix invalid access in sorted set dv (#389)

We introduced invalid accesses for sorted set doc values in LUCENE-9613. 
However, the issue has been unnoticed because the ordinals in doc values
tests aren't complex enough to use high packed bits, and the 3 padding
bytes make these invalid accesses perfectly fine. To reproduce this
issue, we need to use at least 20 bits per value for the ordinals.

> Index corruption: IndexOutOfBoundsException for doc values
> --
>
> Key: LUCENE-10159
> URL: https://issues.apache.org/jira/browse/LUCENE-10159
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Blocker
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Since we upgraded Elasticsearch to a Lucene 9 snapshot, we have seen test 
> failures with the following stack trace. This looks like an issue with the 
> Lucene90 DocValuesFormat.
> {noformat}
> org.apache.lucene.index.MergePolicy$MergeException: 
> java.lang.IndexOutOfBoundsException
>   at 
> org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340)
>  ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
>   at 
> org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737)
>  ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
>   at 
> org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
>  ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  [?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  [?:?]
>   at java.lang.Thread.run(Thread.java:833) [?:?]
> Caused by: java.lang.IndexOutOfBoundsException
>   at java.nio.Buffer.checkIndex(Buffer.java:749) ~[?:?]
>   at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692) ~[?:?]
>   at 
> org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128) 
> ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148) 
> ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$Fi

[GitHub] [lucene] dnhatn merged pull request #389: LUCENE-10159: Fix invalid access in sorted set dv

2021-10-19 Thread GitBox


dnhatn merged pull request #389:
URL: https://github.com/apache/lucene/pull/389


   





[jira] [Updated] (LUCENE-10159) Index corruption: IndexOutOfBoundsException for doc values

2021-10-19 Thread Nhat Nguyen (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nhat Nguyen updated LUCENE-10159:
-
Fix Version/s: main (9.0)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Index corruption: IndexOutOfBoundsException for doc values
> --
>
> Key: LUCENE-10159
> URL: https://issues.apache.org/jira/browse/LUCENE-10159
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Blocker
> Fix For: main (9.0)
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Since we upgraded Elasticsearch to a Lucene 9 snapshot, we have seen test 
> failures with the following stack trace. This looks like an issue with the 
> Lucene90 DocValuesFormat.
> {noformat}
> org.apache.lucene.index.MergePolicy$MergeException: 
> java.lang.IndexOutOfBoundsException
>   at 
> org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2340)
>  ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
>   at 
> org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737)
>  ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
>   at 
> org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
>  ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  [?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  [?:?]
>   at java.lang.Thread.run(Thread.java:833) [?:?]
> Caused by: java.lang.IndexOutOfBoundsException
>   at java.nio.Buffer.checkIndex(Buffer.java:749) ~[?:?]
>   at java.nio.DirectByteBuffer.getInt(DirectByteBuffer.java:692) ~[?:?]
>   at 
> org.apache.lucene.store.ByteBufferGuard.getInt(ByteBufferGuard.java:128) 
> ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readInt(ByteBufferIndexInput.java:591)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.util.packed.DirectReader$DirectPackedReader20.get(DirectReader.java:222)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.util.packed.DirectMonotonicReader.get(DirectMonotonicReader.java:149)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.set(Lucene90DocValuesProducer.java:1356)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$24.docValueCount(Lucene90DocValuesProducer.java:1348)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$25.nextDoc(Lucene90DocValuesProducer.java:1405)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:837)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:148) 
> ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:154)
>  ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:168) 
> ~[lucene-core-9.0.0-snapshot-cb366d04d4a.jar:9.0.0-snapshot-cb366d04d4a 
> cb366d04d4a611faf90b36e49aa4bd9b48a6e83d - romseygeek - 2021-10-01 09:50:47]
>   at 
> org.apache.lucene.index.SegmentMerger.lambda$merge$2(SegmentMerger.ja

[GitHub] [lucene] jpountz commented on pull request #389: LUCENE-10159: Fix invalid access in sorted set dv

2021-10-19 Thread GitBox


jpountz commented on pull request #389:
URL: https://github.com/apache/lucene/pull/389#issuecomment-946681702


   @rmuir Agreed with your thoughts. I wonder what you would think of reducing 
the padding to the strict minimum depending on the number of bits per value. 
This would make it harder to change how we read values in the future, but this 
would also make it more likely to detect out-of-bounds access in the future as 
we could also detect this on low numbers of bits per value.





[GitHub] [lucene] rmuir commented on pull request #389: LUCENE-10159: Fix invalid access in sorted set dv

2021-10-19 Thread GitBox


rmuir commented on pull request #389:
URL: https://github.com/apache/lucene/pull/389#issuecomment-946683120


   +1 to that idea as a follow-up issue too. It is really bad that the padding 
masked the bug here! Thanks @dnhatn for all the debugging and the fix!





[jira] [Created] (LUCENE-10187) Reduce DirectWriter's padding to a minimum

2021-10-19 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-10187:
-

 Summary: Reduce DirectWriter's padding to a minimum
 Key: LUCENE-10187
 URL: https://issues.apache.org/jira/browse/LUCENE-10187
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand


This is a follow-up of LUCENE-10159 where DirectWriter's padding hid an 
out-of-bounds access. A consequence of DirectWriter's padding is that 
out-of-bounds access is completely silent until doc values use strictly more 
than 16 bits per ord, a situation that almost never occurs in our tests.
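
A minimal sketch of how trailing padding can hide an off-by-one read, assuming 
16-bit values packed back to back and a hypothetical 3-byte zero pad. The class 
and method names below are illustrative only and are not DirectWriter's API.

```java
import java.nio.ByteBuffer;

class PaddingMasksOutOfBounds {

  /** Packs 16-bit values back to back, followed by paddingBytes zero bytes. */
  static ByteBuffer write(short[] values, int paddingBytes) {
    ByteBuffer buffer = ByteBuffer.allocate(values.length * Short.BYTES + paddingBytes);
    for (short value : values) {
      buffer.putShort(value);
    }
    return buffer;
  }

  /** Returns true if reading the value at index throws, i.e. the bad access is noticed. */
  static boolean readIsDetected(ByteBuffer data, int index) {
    try {
      data.getShort(index * Short.BYTES); // absolute read, bounds-checked against the limit
      return false;
    } catch (IndexOutOfBoundsException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    short[] values = {1, 2, 3};
    int onePastTheEnd = values.length; // deliberately invalid index
    System.out.println("padded:   detected=" + readIsDetected(write(values, 3), onePastTheEnd));
    System.out.println("unpadded: detected=" + readIsDetected(write(values, 0), onePastTheEnd));
  }
}
```

With the pad in place, the read one past the last value lands inside the 
padding and silently returns zero; without it, the same read throws.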






[jira] [Created] (LUCENE-10188) Give SortedSetDocValues a docValueCount()?

2021-10-19 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-10188:
-

 Summary: Give SortedSetDocValues a docValueCount()?
 Key: LUCENE-10188
 URL: https://issues.apache.org/jira/browse/LUCENE-10188
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Adrien Grand


Theoretically SortedSetDocValues gives more options to codecs with regard to 
how SORTED_SET doc values could store ords. However in practice we currently 
always store counts. Maybe giving SORTED_SET doc values an API that is closer 
to the API of SORTED_NUMERIC doc values would be a better trade-off?






[jira] [Commented] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-19 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430531#comment-17430531
 ] 

Adrien Grand commented on LUCENE-10180:
---

I'm not sure either, I don't think I did anything special so that this PR would 
not get linked here. Sorry [~vigyas]. If you're looking for an easy issue to 
get started, I could recommend this one: LUCENE-10084, though it's not related 
to merging.

> Remove usage of lambdas in SegmentMerger?
> -
>
> Key: LUCENE-10180
> URL: https://issues.apache.org/jira/browse/LUCENE-10180
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: profile.png
>
>
> SegmentMerger now uses lambdas to share the logic around logging merging 
> times for all file formats.
> One problem is that these lambdas get auto-generated names, and it makes it 
> harder to work with profilers since things that should logically end up in 
> the same sub tree end up in different sub trees because two instances of the 
> same lambda get different names.






[GitHub] [lucene] jpountz merged pull request #385: LUCENE-10180: Avoid using lambdas in SegmentMerger.

2021-10-19 Thread GitBox


jpountz merged pull request #385:
URL: https://github.com/apache/lucene/pull/385


   





[jira] [Commented] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-19 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430532#comment-17430532
 ] 

Adrien Grand commented on LUCENE-10180:
---

bq. does the proposed solution (function pointers) make the profiles more 
consistent?

Yes it does, since the method ref is always given its actual name in profiles.







[jira] [Commented] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430533#comment-17430533
 ] 

ASF subversion and git services commented on LUCENE-10180:
--

Commit 1448e4739b90613d63ac9efeea1326214b720638 in lucene's branch 
refs/heads/main from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=1448e47 ]

LUCENE-10180: Avoid using lambdas in SegmentMerger. (#385)









[jira] [Resolved] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-19 Thread Adrien Grand (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-10180.
---
Fix Version/s: main (9.0)
   Resolution: Fixed

> Remove usage of lambdas in SegmentMerger?
> -
>
> Key: LUCENE-10180
> URL: https://issues.apache.org/jira/browse/LUCENE-10180
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: main (9.0)
>
> Attachments: profile.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> SegmentMerger now uses lambdas to share the logic around logging merging 
> times for all file formats.
> One problem is that these lambdas get auto-generated names, and it makes it 
> harder to work with profilers since things that should logically end up in 
> the same sub tree end up in different sub trees because two instances of the 
> same lambda get different names.






[GitHub] [lucene] jpountz opened a new pull request #398: LUCENE-10187: Reduce DirectWriter's padding.

2021-10-19 Thread GitBox


jpountz opened a new pull request #398:
URL: https://github.com/apache/lucene/pull/398


   It would make us more likely to detect out-of-bounds access in the future.





[GitHub] [lucene] uschindler commented on pull request #385: LUCENE-10180: Avoid using lambdas in SegmentMerger.

2021-10-19 Thread GitBox


uschindler commented on pull request #385:
URL: https://github.com/apache/lucene/pull/385#issuecomment-946762526


   Thanks! ❤️





[jira] [Commented] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-19 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430575#comment-17430575
 ] 

Uwe Schindler commented on LUCENE-10180:


{quote}
bq. does the proposed solution (function pointers) make the profiles more 
consistent?

Yes it does, since the method ref is always given its actual name in profiles.
{quote}

Background for [~sokolov]: Lambdas can't be compiled to without creating a 
method out of it. So {{a -> foobar(a)}} will generate a static or virtual 
method (depending on if access to "this" is needed) named {{lambda$XY(a)}} with 
the body {{return foobar(a)}. This is of course not needed but you always see 
the lambda  method in the stack traces. So to better allow to see where 
something happens in "simple cases" (does not work in complex chains with Java 
streams): Avoid lambdas and add the bodies as methods. But always look at 
signatures and always prefer a method reference anywhere in code if a lambda 
that only calls another method with exact same parameter (with or without 
"this" capture).

> Remove usage of lambdas in SegmentMerger?
> -
>
> Key: LUCENE-10180
> URL: https://issues.apache.org/jira/browse/LUCENE-10180
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: main (9.0)
>
> Attachments: profile.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SegmentMerger now uses lambdas to share the logic around logging merging 
> times for all file formats.
> One problem is that these lambdas get auto-generated names, and it makes it 
> harder to work with profilers since things that should logically end up in 
> the same sub tree end up in different sub trees because two instances of the 
> same lambda get different names.






[jira] [Comment Edited] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-19 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430575#comment-17430575
 ] 

Uwe Schindler edited comment on LUCENE-10180 at 10/19/21, 2:20 PM:
---

{quote}
bq. does the proposed solution (function pointers) make the profiles more 
consistent?

Yes it does, since the method ref is always given its actual name in profiles.
{quote}

Background for [~sokolov]: Lambdas can't be compiled to bytecode without 
creating a method out of it (and then make a reference to the same type of 
method reference syntax in the lambda bootstrap invokedynamic). So {{a -> 
foobar(a)}} will generate a static or virtual method (depending on if access to 
"this" is needed) named {{lambda$XY(a)}} with the body {{return foobar(a)}. 
This is of course not needed but you always see the lambda  method in the stack 
traces. So to better allow to see where something happens in "simple cases" 
(does not work in complex chains with Java streams): Avoid lambdas and add the 
bodies as methods. But always look at signatures and always prefer a method 
reference anywhere in code if a lambda that only calls another method with 
exact same parameter (with or without "this" capture).


was (Author: thetaphi):
{quote}
bq. does the proposed solution (function pointers) make the profiles more 
consistent?

Yes it does, since the method ref is always given its actual name in profiles.
{quote}

Background for [~sokolov]: Lambdas can't be compiled to without creating a 
method out of it. So {{a -> foobar(a)}} will generate a static or virtual 
method (depending on if access to "this" is needed) named {{lambda$XY(a)}} with 
the body {{return foobar(a)}. This is of course not needed but you always see 
the lambda  method in the stack traces. So to better allow to see where 
something happens in "simple cases" (does not work in complex chains with Java 
streams): Avoid lambdas and add the bodies as methods. But always look at 
signatures and always prefer a method reference anywhere in code if a lambda 
that only calls another method with exact same parameter (with or without 
"this" capture).







[jira] [Comment Edited] (LUCENE-10180) Remove usage of lambdas in SegmentMerger?

2021-10-19 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430575#comment-17430575
 ] 

Uwe Schindler edited comment on LUCENE-10180 at 10/19/21, 2:21 PM:
---

{quote}
bq. does the proposed solution (function pointers) make the profiles more 
consistent?

Yes it does, since the method ref is always given its actual name in profiles.
{quote}

Background for [~sokolov]: Lambdas can't be compiled to bytecode without 
creating a method out of them (the lambda bootstrap invokedynamic then refers 
to that method the same way it would for method reference syntax). So {{a -> 
foobar(a)}} will generate a static or virtual method (depending on whether 
access to "this" is needed) named {{lambda$XY(a)}} with the body {{return 
foobar(a)}}. This extra method is of course not needed, but you always see the 
lambda method in stack traces. So to make it easier to see where something 
happens in "simple cases" (this does not work in complex chains with Java 
streams): avoid lambdas and add the bodies as methods. But always look at 
signatures, and always prefer a method reference anywhere in code where a 
lambda only calls another method with the exact same parameters (with or 
without "this" capture).
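
The difference can be observed with a small sketch like the following. It 
assumes a standard compiler that desugars lambdas into synthetic methods whose 
names start with {{lambda$}}; the class and method names are hypothetical, and 
the exact synthetic name depends on the compiler.

```java
import java.util.function.Function;

class LambdaFrames {

  /** Returns the name of the method that directly called foobar, as seen in the stack trace. */
  static String foobar(String ignored) {
    StackTraceElement[] stack = new Throwable().getStackTrace();
    return stack[1].getMethodName(); // stack[0] is foobar itself
  }

  /** Calls foobar through a lambda; the caller frame is the synthetic lambda method. */
  static String viaLambda() {
    Function<String, String> f = a -> foobar(a);
    return f.apply("x");
  }

  /** Calls foobar through a method reference; no synthetic lambda method is generated. */
  static String viaMethodRef() {
    Function<String, String> f = LambdaFrames::foobar;
    return f.apply("x");
  }

  public static void main(String[] args) {
    System.out.println("lambda caller frame:     " + viaLambda());
    System.out.println("method ref caller frame: " + viaMethodRef());
  }
}
```

The first line shows a compiler-generated {{lambda$...}} name sitting between 
the call site and {{foobar}}; the second shows no such frame, which is why 
profiles group method references under their real names.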


was (Author: thetaphi):
{quote}
bq. does the proposed solution (function pointers) make the profiles more 
consistent?

Yes it does, since the method ref is always given its actual name in profiles.
{quote}

Background for [~sokolov]: Lambdas can't be compiled to bytecode without 
creating a method out of it (and then make a reference to the same type of 
method reference syntax in the lambda bootstrap invokedynamic). So {{a -> 
foobar(a)}} will generate a static or virtual method (depending on if access to 
"this" is needed) named {{lambda$XY(a)}} with the body {{return foobar(a)}. 
This is of course not needed but you always see the lambda  method in the stack 
traces. So to better allow to see where something happens in "simple cases" 
(does not work in complex chains with Java streams): Avoid lambdas and add the 
bodies as methods. But always look at signatures and always prefer a method 
reference anywhere in code if a lambda that only calls another method with 
exact same parameter (with or without "this" capture).







[GitHub] [lucene] uschindler merged pull request #397: LUCENE-10182: Order assertion parameters correctly

2021-10-19 Thread GitBox


uschindler merged pull request #397:
URL: https://github.com/apache/lucene/pull/397


   





[jira] [Commented] (LUCENE-10182) TestRamUsageEstimator asserts trivial equality

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430581#comment-17430581
 ] 

ASF subversion and git services commented on LUCENE-10182:
--

Commit 54c5a2ce28d35c3ff9eb98aa83a69ca6d0f69134 in lucene's branch 
refs/heads/main from Stefan Vodita
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=54c5a2c ]

LUCENE-10182: Order assertion parameters correctly (#397)



> TestRamUsageEstimator asserts trivial equality
> --
>
> Key: LUCENE-10182
> URL: https://issues.apache.org/jira/browse/LUCENE-10182
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Stefan Vodita
>Assignee: Uwe Schindler
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TestRamUsageEstimator.testStaticOverloads}} has several lines like:
> {code:java}
> assertEquals(sizeOf(array), sizeOf((Object) array));
> {code}
> Both calls to {{sizeOf()}} fall back on {{RamUsageTester.sizeOf}}, making the 
> 2 calls identical. Instead, we would want one of the calls to go to 
> {{RamUsageEstimator.sizeOf}}.
>  
> This issue came up while working on LUCENE-10129. A possible solution, as per 
> [~uschindler]'s suggestion, would be to remove the static import
> {code:java}
> import static org.apache.lucene.util.RamUsageTester.sizeOf;
> {code}
> Instead, we could be explicit on which method we are calling, like:
> {code:java}
> assertEquals(RamUsageEstimator.sizeOf(array), RamUsageTester.sizeOf(array));
> {code}
> This could be replicated for other potentially confusing cases in the test 
> class.






[jira] [Commented] (LUCENE-10182) TestRamUsageEstimator asserts trivial equality

2021-10-19 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430582#comment-17430582
 ] 

Uwe Schindler commented on LUCENE-10182:


Merged!

> TestRamUsageEstimator asserts trivial equality
> --
>
> Key: LUCENE-10182
> URL: https://issues.apache.org/jira/browse/LUCENE-10182
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Stefan Vodita
>Assignee: Uwe Schindler
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{TestRamUsageEstimator.testStaticOverloads}} has several lines like:
> {code:java}
> assertEquals(sizeOf(array), sizeOf((Object) array));
> {code}
> Both calls to {{sizeOf()}} fall back on {{RamUsageTester.sizeOf}}, making the 
> 2 calls identical. Instead, we would want one of the calls to go to 
> {{RamUsageEstimator.sizeOf}}.
>  
> This issue came up while working on LUCENE-10129. A possible solution, as per 
> [~uschindler]'s suggestion, would be to remove the static import
> {code:java}
> import static org.apache.lucene.util.RamUsageTester.sizeOf;
> {code}
> Instead, we could be explicit on which method we are calling, like:
> {code:java}
> assertEquals(RamUsageEstimator.sizeOf(array), RamUsageTester.sizeOf(array));
> {code}
> This could be replicated for other potentially confusing cases in the test 
> class.






[jira] [Created] (LUCENE-10189) Optimize SortedSet/SortedNumeric doc values writers for fields that are effectively single-valued

2021-10-19 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-10189:
-

 Summary: Optimize SortedSet/SortedNumeric doc values writers for 
fields that are effectively single-valued
 Key: LUCENE-10189
 URL: https://issues.apache.org/jira/browse/LUCENE-10189
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Adrien Grand


I was wondering how much overhead multi-valued doc-value types have over their 
single-valued counterparts, so I hacked IndexTaxis to index all doc-value 
fields via Sorted(Set|Numeric)DocValuesField instead of 
(Sorted|Numeric)DocValuesField and flush times increased by 30%. It should be 
easy to automatically detect such cases in the doc values writers?
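
One way such detection could look, as a rough sketch only: count documents that 
have the field and the total number of values, and treat the field as 
single-valued when the two match. {{SingleValuedDetector}} is a hypothetical 
helper, not an existing Lucene class, and it assumes the values of each 
document arrive already deduplicated.

```java
class SingleValuedDetector {

  private int docsWithField; // documents that had at least one value
  private long totalValues;  // values seen across all documents

  /** Records the values of one document; an empty array means the field is absent. */
  void addDocument(long[] valuesForDoc) {
    if (valuesForDoc.length > 0) {
      docsWithField++;
      totalValues += valuesForDoc.length;
    }
  }

  /** True if no document recorded so far carried more than one value. */
  boolean isEffectivelySingleValued() {
    // Every counted document contributes at least one value, so the totals
    // are equal exactly when each of them contributed exactly one.
    return totalValues == docsWithField;
  }

  public static void main(String[] args) {
    SingleValuedDetector detector = new SingleValuedDetector();
    detector.addDocument(new long[] {5});
    detector.addDocument(new long[] {});
    detector.addDocument(new long[] {7});
    System.out.println(detector.isEffectivelySingleValued());
    detector.addDocument(new long[] {1, 2});
    System.out.println(detector.isEffectivelySingleValued());
  }
}
```

At flush time a writer could consult such a flag and take the single-valued 
code path; whether that recovers the 30% measured above is not something this 
sketch shows.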






[jira] [Created] (LUCENE-10190) Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber

2021-10-19 Thread Dawid Weiss (Jira)
Dawid Weiss created LUCENE-10190:


 Summary: Assertion error in 
TestIndexWriter.testMaxCompletedSequenceNumber
 Key: LUCENE-10190
 URL: https://issues.apache.org/jira/browse/LUCENE-10190
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss


CI failure in PR at:
https://github.com/apache/lucene/pull/396/checks?check_run_id=3936559246

Does not reproduce. Stack below.

{code}
org.apache.lucene.index.TestIndexWriter > testMaxCompletedSequenceNumber FAILED
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
uncaught exception in thread: Thread[id=1840, name=Thread-1481, state=RUNNABLE, 
group=TGRP-TestIndexWriter]
at 
__randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)

Caused by:
java.lang.AssertionError: expected:<1> but was:<0>
at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at 
org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
at java.base/java.lang.Thread.run(Thread.java:829)

org.apache.lucene.index.TestIndexWriter > test suite's output saved to 
/home/runner/work/lucene/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexWriter.txt,
 copied below:
  2> أكتوبر ١٩, ٢٠٢١ ١٠:٢٧:٢٧ ص 
com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
 uncaughtException
  2> WARNING: Uncaught exception in thread: 
Thread[Thread-1481,5,TGRP-TestIndexWriter]
  2> java.lang.AssertionError: expected:<1> but was:<0>
  2>at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
  2>at org.junit.Assert.fail(Assert.java:89)
  2>at org.junit.Assert.failNotEquals(Assert.java:835)
  2>at org.junit.Assert.assertEquals(Assert.java:647)
  2>at org.junit.Assert.assertEquals(Assert.java:633)
  2>at 
org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
  2>at java.base/java.lang.Thread.run(Thread.java:829)
  2> 
   > com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
uncaught exception in thread: Thread[id=1840, name=Thread-1481, state=RUNNABLE, 
group=TGRP-TestIndexWriter]
   > at 
__randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)
   > 
   > Caused by:
   > java.lang.AssertionError: expected:<1> but was:<0>
   > at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
   > at org.junit.Assert.fail(Assert.java:89)
   > at org.junit.Assert.failNotEquals(Assert.java:835)
   > at org.junit.Assert.assertEquals(Assert.java:647)
   > at org.junit.Assert.assertEquals(Assert.java:633)
   > at 
org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
   > at java.base/java.lang.Thread.run(Thread.java:829)
  2> NOTE: reproduce with: gradlew test --tests 
TestIndexWriter.testMaxCompletedSequenceNumber -Dtests.seed=5B8EFE8DBEFB881C 
-Dtests.badapples=true -Dtests.locale=ar-KW -Dtests.timezone=Africa/Lusaka 
-Dtests.asserts=true -Dtests.file.encoding=UTF-8
{code}






[GitHub] [lucene] dweiss commented on pull request #396: LUCENE-10174 BuildAndPushRelease additional improvements

2021-10-19 Thread GitBox


dweiss commented on pull request #396:
URL: https://github.com/apache/lucene/pull/396#issuecomment-946803083


   CI failed with LUCENE-10190. I respinned.





[GitHub] [lucene] dweiss edited a comment on pull request #396: LUCENE-10174 BuildAndPushRelease additional improvements

2021-10-19 Thread GitBox


dweiss edited a comment on pull request #396:
URL: https://github.com/apache/lucene/pull/396#issuecomment-946803083


   CI failed with LUCENE-10190. I respun.





[GitHub] [lucene] janhoy commented on pull request #396: LUCENE-10174 BuildAndPushRelease additional improvements

2021-10-19 Thread GitBox


janhoy commented on pull request #396:
URL: https://github.com/apache/lucene/pull/396#issuecomment-946830386


   > LGTM. I also think all gradle invocations from within the python script 
shouldn't fork the daemon (--no-daemon) - this prevents leaking memory and 
makes sure nothing is left behind in case of errors.
   
   Are you saying that `--no-daemon` would be better? I could fold that into 
this PR.





[GitHub] [lucene] dweiss commented on pull request #396: LUCENE-10174 BuildAndPushRelease additional improvements

2021-10-19 Thread GitBox


dweiss commented on pull request #396:
URL: https://github.com/apache/lucene/pull/396#issuecomment-946910151


   Yes, I think --no-daemon would be helpful here (in addition to worker 
restriction). This leaves nothing behind - clean slate for a re-run.
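A minimal sketch of how the Python script could apply this to every Gradle call. The helper names `gradle_command` and `run_gradle`, the `./gradlew` path and the worker count are assumptions for illustration and are not taken from `buildAndPushRelease.py`; `--no-daemon` and `--max-workers` are standard Gradle command-line options.

```python
import subprocess


def gradle_command(*tasks, max_workers=2, gradlew="./gradlew"):
    """Build a gradlew argument list that never forks a daemon.

    Hypothetical helper for illustration; not the release script's own code.
    """
    return [gradlew, "--no-daemon", f"--max-workers={max_workers}", *tasks]


def run_gradle(*tasks, **kwargs):
    # check=True raises CalledProcessError on a non-zero exit code, so a
    # failed build stops the script instead of being silently ignored.
    subprocess.run(gradle_command(*tasks, **kwargs), check=True)


# Show the command line that would be executed for the "check" task.
print(" ".join(gradle_command("check")))
```

Routing every invocation through one helper keeps the flag from being forgotten on an individual call.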





[jira] [Commented] (LUCENE-10189) Optimize SortedSet/SortedNumeric doc values writers for fields that are effectively single-valued

2021-10-19 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430647#comment-17430647
 ] 

Robert Muir commented on LUCENE-10189:
--

For the SortedSet case, it seems like we want to fix the IW component to use 
{{DocValues.singleton}}, box it up, and return it from {{SortedSetDocValues 
getDocValues()}}?

Then the DocValuesConsumer can simply check with {{DocValues.unwrapSingleton}}. 
The same codepath can work for merge and flush.
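The wrap-then-unwrap idea can be sketched in a language-neutral way. This is a Python illustration only: the classes and functions below are hypothetical stand-ins for the roles played by {{DocValues.singleton}} and {{DocValues.unwrapSingleton}}, not Lucene's actual API.

```python
class SingleValued:
    """At most one value per document; None marks a document without a value."""

    def __init__(self, values):
        self.values = values


class SingletonView:
    """Multi-valued view over single-valued data that remembers what it wraps."""

    def __init__(self, single):
        self.single = single

    def values_for(self, doc):
        value = self.single.values[doc]
        return [] if value is None else [value]


class MultiValued:
    """Genuinely multi-valued data: a list of values per document."""

    def __init__(self, values_per_doc):
        self.values_per_doc = values_per_doc

    def values_for(self, doc):
        return self.values_per_doc[doc]


def unwrap_singleton(doc_values):
    # Return the wrapped single-valued instance, or None if there is none.
    if isinstance(doc_values, SingletonView):
        return doc_values.single
    return None


def buffered_to_doc_values(values_per_doc):
    # Writer side: if no document holds more than one value, expose the
    # buffered data through the singleton wrapper.
    if all(len(values) <= 1 for values in values_per_doc):
        single = SingleValued([v[0] if v else None for v in values_per_doc])
        return SingletonView(single)
    return MultiValued(values_per_doc)


def describe_encoding(doc_values):
    # Consumer side: one check picks the cheaper path, whether the input
    # comes from a flush or from a merge.
    if unwrap_singleton(doc_values) is not None:
        return "single-valued"
    return "multi-valued"
```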

> Optimize SortedSet/SortedNumeric doc values writers for fields that are 
> effectively single-valued
> -
>
> Key: LUCENE-10189
> URL: https://issues.apache.org/jira/browse/LUCENE-10189
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
>
> I was wondering how much overhead multi-valued doc-value types have over 
> their single-valued counterparts, so I hacked IndexTaxis to index all 
> doc-value fields via Sorted(Set|Numeric)DocValuesField instead of 
> (Sorted|Numeric)DocValuesField and flush times increased by 30%. It should be 
> easy to automatically detect such cases in the doc values writers?






[jira] [Commented] (LUCENE-10147) KnnVectorQuery can produce negative scores

2021-10-19 Thread Julie Tibshirani (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430662#comment-17430662
 ] 

Julie Tibshirani commented on LUCENE-10147:
---

[~msoko...@gmail.com] you mentioned that we discussed enforcing that vectors 
are unit length when using {{VectorSimilarityFunction#DOT_PRODUCT}}. I'm 
wondering why we decided not to go in that direction (I couldn't find the 
discussion in JIRA/GitHub)? This is just for my context; I don't have strong 
feelings about the decision.

> KnnVectorQuery can produce negative scores
> --
>
> Key: LUCENE-10147
> URL: https://issues.apache.org/jira/browse/LUCENE-10147
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Julie Tibshirani
>Priority: Blocker
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The cosine similarity of two vectors falls in the range [-1, 1]. So currently 
> with cosine similarity, {{KnnVectorQuery}} can produce negative scores. Maybe 
> we should just adjust the scores in this case by adding 1, shifting them to 
> the range [0, 2].
> As a side note, this made me notice that 
> {{VectorSimilarityFunction.DOT_PRODUCT}} is really quite "expert"! Users need 
> to know to normalize all document and query vectors to unit length when using 
> this similarity. Otherwise the output is unbounded and difficult to handle in 
> scoring. Also dot product is not a true metric: for example, it doesn't obey 
> the triangle inequality. So many ANN algorithms have trouble supporting it. 
> As part of this issue, we could improve the documentation on 
> {{VectorSimilarityFunction.DOT_PRODUCT}} to clarify that normalization is 
> required.
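The score adjustment and the normalization requirement described in the issue can be sketched as follows. This is a plain-Python illustration under the assumption that vectors are lists of floats; the function names are hypothetical and this is not Lucene's implementation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale v to unit length, as required before using dot product as a score."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def shifted_cosine_score(a, b):
    # Cosine lies in [-1, 1]; adding 1 moves the score into [0, 2].
    return 1.0 + cosine(a, b)
```

For unit-length vectors the dot product equals the cosine, which is why normalizing up front keeps dot-product scores bounded.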






[GitHub] [lucene] janhoy merged pull request #396: LUCENE-10174 BuildAndPushRelease additional improvements

2021-10-19 Thread GitBox


janhoy merged pull request #396:
URL: https://github.com/apache/lucene/pull/396


   





[jira] [Commented] (LUCENE-10174) Update buildAndPushRelease.py for new gradle build

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430670#comment-17430670
 ] 

ASF subversion and git services commented on LUCENE-10174:
--

Commit f5486d13e6f440a7296c23f45cd53f0313e83e0e in lucene's branch 
refs/heads/main from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=f5486d1 ]

LUCENE-10174 BuildAndPushRelease additional improvements (#396)



> Update buildAndPushRelease.py for new gradle build
> --
>
> Key: LUCENE-10174
> URL: https://issues.apache.org/jira/browse/LUCENE-10174
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> With LUCENE-9488 and LUCENE-10173 the gradle build was polished to properly 
> build source and binary artifacts, and sign those using either gpg tool or a 
> built-in java-based signing plugin. See 
> [https://github.com/apache/lucene/blob/main/help/publishing.txt]
> This jira will update {{buildAndPushRelease.py}} script to use the correct 
> build parameters. It will also add cmdline args to choose between gpg and 
> built-in (gpg default), and to supply the location of {{gpgHome}} if you do 
> not use gpg. We'll also add an option to NOT prompt for passphrase in the 
> python script, which will fallback to defaults (gpg-agent, env.vars or 
> gradle.properties).






[jira] [Commented] (LUCENE-10189) Optimize SortedSet/SortedNumeric doc values writers for fields that are effectively single-valued

2021-10-19 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430710#comment-17430710
 ] 

Adrien Grand commented on LUCENE-10189:
---

Right, I tried to do this in the linked PR (for some reason it wasn't linked 
automatically, I just did it manually).

> Optimize SortedSet/SortedNumeric doc values writers for fields that are 
> effectively single-valued
> -
>
> Key: LUCENE-10189
> URL: https://issues.apache.org/jira/browse/LUCENE-10189
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
>
> I was wondering how much overhead multi-valued doc-value types have over 
> their single-valued counterparts, so I hacked IndexTaxis to index all 
> doc-value fields via Sorted(Set|Numeric)DocValuesField instead of 
> (Sorted|Numeric)DocValuesField and flush times increased by 30%. It should be 
> easy to automatically detect such cases in the doc values writers?






[jira] [Closed] (LUCENE-10126) CompetitiveIterator of NumericComparator can wrongly skip documents

2021-10-19 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova closed LUCENE-10126.


Closing after the 8.10.1 release

> CompetitiveIterator of NumericComparator can wrongly skip documents
> ---
>
> Key: LUCENE-10126
> URL: https://issues.apache.org/jira/browse/LUCENE-10126
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 8.9, 8.10
>Reporter: Nhat Nguyen
>Priority: Major
> Fix For: 8.11, 8.10.1, 9.10
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> The ML team at Elastic reported that a large scroll with an Elasticsearch 
> nightly build that uses Lucene 9.0 snapshot returns fewer documents than 
> expected. I looked into it and found that the competitive iterator can 
> wrongly skip docs with a chunked bulk scorer.






[jira] [Closed] (LUCENE-10119) singleSort should not be set when after is non-null

2021-10-19 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova closed LUCENE-10119.


Closing after the 8.10.1 release

> singleSort should not be set when after is non-null
> ---
>
> Key: LUCENE-10119
> URL: https://issues.apache.org/jira/browse/LUCENE-10119
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: main (9.0), 8.10
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
> Fix For: main (9.0), 8.11, 8.10.1
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Today we set the parameter `singleSort` to true when we have a single 
> comparator to skip documents whose values equal the last visited value. 
> However, this is incorrect when the search_after parameter is non-null, 
> because we will then skip documents whose values are equal but whose docIDs 
> are greater than the docID of the `search_after` parameter.
>  
> We found this issue in Elasticsearch after upgrading it to Lucene 8.10 and 
> Lucene 9.






[jira] [Closed] (LUCENE-10110) MultiCollector should conditionally wrap single leaf collector

2021-10-19 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova closed LUCENE-10110.


Closing after the 8.10.1 release

> MultiCollector should conditionally wrap single leaf collector
> --
>
> Key: LUCENE-10110
> URL: https://issues.apache.org/jira/browse/LUCENE-10110
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jim Ferenczi
>Priority: Minor
> Fix For: main (9.0), 8.11, 8.10.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> MultiCollector adapts the score mode of multiple collectors so that they can 
> run together in a search. If a collector wants to skip low-scoring hits, this 
> adapter ensures that the other collectors still see all hits. However, when 
> all these collectors have terminated early, we allow the skipping collector 
> to start propagating the minimum score. This is not valid because the weight 
> of the query is built from the combined score mode of all collectors at the 
> beginning of the search.
> So we should always ignore the minimum score in MultiCollector if the 
> combined score mode is different from TOP_SCORES.






[jira] [Commented] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430727#comment-17430727
 ] 

Jan Høydahl commented on LUCENE-9997:
-

Yay, first smoketest SUCCESS on freshly built Lucene release artifacts on [PR 
391|https://github.com/apache/lucene/pull/391].
{code:java}
...
verify maven artifact sigs 
.
unpack lucene-9.0.0.tgz...
verify that Maven artifacts are same as in the binary distribution...
verify JAR metadata/identity/no javax.* or java.* classes...

SUCCESS! [0:09:23.758716]{code}

> Revisit smoketester for 9.0 build
> -
>
> Key: LUCENE-9997
> URL: https://issues.apache.org/jira/browse/LUCENE-9997
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Robert Muir
>Priority: Major
> Attachments: image-2021-10-12-12-47-11-480.png, 
> image-2021-10-12-12-48-15-373.png
>
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> Currently we have a (great) {{dev-tools/scripts/smokeTester.py}} that will 
> perform automated tests against a release.
> This was developed with the ant build process in mind.
> This issue is just about considering the automated checks we do here, maybe 
> some of them can be done efficiently in the gradle build in earlier places: 
> this would be a large improvement!
> Obviously some of them (e.g. GPG release key verifications) are really 
> specific to the artifacts in question. These are most important to release 
> verification, as that is actually the only place we can check it.
> Any other checks (and I do tend to think, this checker should try to be 
> thorough, invoking gradle etc), should be stuff we regularly test in 
> PRs/nightly/builds.






[GitHub] [lucene] janhoy commented on pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


janhoy commented on pull request #391:
URL: https://github.com/apache/lucene/pull/391#issuecomment-947035468


   ```
   verify maven artifact sigs 

   unpack lucene-9.0.0.tgz...
   verify that Maven artifacts are same as in the binary distribution...
   verify JAR metadata/identity/no javax.* or java.* classes...
   
   SUCCESS! [0:09:23.758716]
   ```
   
   I also added the `--no-daemon` arg to all gradlew commands here.
   I'll merge this in now. Then feel free to open further PRs against 
LUCENE-9997 for smoke tester improvements, such as WSL support etc.





[GitHub] [lucene] janhoy merged pull request #391: LUCENE-9997 Second pass smoketester fixes for 9.0

2021-10-19 Thread GitBox


janhoy merged pull request #391:
URL: https://github.com/apache/lucene/pull/391


   





[jira] [Commented] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430730#comment-17430730
 ] 

ASF subversion and git services commented on LUCENE-9997:
-

Commit c77e9ddf93ae872ba6556d39c48a0a32e31e91b1 in lucene's branch 
refs/heads/main from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=c77e9dd ]

LUCENE-9997 Second pass smoketester fixes for 9.0 (#391)

* Java17 fixes

* Add to error message that the unexpected file is in lucene/ folder

* Fix gpg command utf-8 output

* Add --no-daemon to all gradle calls, and skip clean

Co-authored-by: Dawid Weiss 
Co-Authored-by: Tomoko Uchida 

> Revisit smoketester for 9.0 build
> -
>
> Key: LUCENE-9997
> URL: https://issues.apache.org/jira/browse/LUCENE-9997
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Robert Muir
>Priority: Major
> Attachments: image-2021-10-12-12-47-11-480.png, 
> image-2021-10-12-12-48-15-373.png
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> Currently we have a (great) {{dev-tools/scripts/smokeTester.py}} that will 
> perform automated tests against a release.
> This was developed with the ant build process in mind.
> This issue is just about considering the automated checks we do here, maybe 
> some of them can be done efficiently in the gradle build in earlier places: 
> this would be a large improvement!
> Obviously some of them (e.g. GPG release key verifications) are really 
> specific to the artifacts in question. These are most important to release 
> verification, as that is actually the only place we can check it.
> Any other checks (and I do tend to think, this checker should try to be 
> thorough, invoking gradle etc), should be stuff we regularly test in 
> PRs/nightly/builds.






[jira] [Assigned] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned LUCENE-9997:
---

Assignee: Jan Høydahl

> Revisit smoketester for 9.0 build
> -
>
> Key: LUCENE-9997
> URL: https://issues.apache.org/jira/browse/LUCENE-9997
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Robert Muir
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: image-2021-10-12-12-47-11-480.png, 
> image-2021-10-12-12-48-15-373.png
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Currently we have a (great) {{dev-tools/scripts/smokeTester.py}} that will 
> perform automated tests against a release.
> This was developed with the ant build process in mind.
> This issue is just about considering the automated checks we do here, maybe 
> some of them can be done efficiently in the gradle build in earlier places: 
> this would be a large improvement!
> Obviously some of them (e.g. GPG release key verifications) are really 
> specific to the artifacts in question. These are most important to release 
> verification, as that is actually the only place we can check it.
> Any other checks (and I do tend to think, this checker should try to be 
> thorough, invoking gradle etc), should be stuff we regularly test in 
> PRs/nightly/builds.






[jira] [Resolved] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved LUCENE-9997.
-
Fix Version/s: main (9.0)
   Resolution: Fixed

> Revisit smoketester for 9.0 build
> -
>
> Key: LUCENE-9997
> URL: https://issues.apache.org/jira/browse/LUCENE-9997
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Robert Muir
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
> Attachments: image-2021-10-12-12-47-11-480.png, 
> image-2021-10-12-12-48-15-373.png
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Currently we have a (great) {{dev-tools/scripts/smokeTester.py}} that will 
> perform automated tests against a release.
> This was developed with the ant build process in mind.
> This issue is just about considering the automated checks we do here, maybe 
> some of them can be done efficiently in the gradle build in earlier places: 
> this would be a large improvement!
> Obviously some of them (e.g. GPG release key verifications) are really 
> specific to the artifacts in question. These are most important to release 
> verification, as that is actually the only place we can check it.
> Any other checks (and I do tend to think, this checker should try to be 
> thorough, invoking gradle etc), should be stuff we regularly test in 
> PRs/nightly/builds.






[jira] [Created] (LUCENE-10191) Optimize vector functions by precomputing magnitudes

2021-10-19 Thread Julie Tibshirani (Jira)
Julie Tibshirani created LUCENE-10191:
-

 Summary: Optimize vector functions by precomputing magnitudes
 Key: LUCENE-10191
 URL: https://issues.apache.org/jira/browse/LUCENE-10191
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Julie Tibshirani


Both euclidean distance (L2 norm) and cosine similarity can be expressed in 
terms of dot product and vector magnitudes:
 * l2_norm(a, b) = ||a - b|| = sqrt(||a||^2 - 2(a . b) + ||b||^2)
 * cosine(a, b) = (a . b) / (||a|| ||b||)

We could compute and store each vector's magnitude upfront while indexing, and 
compute the query vector's magnitude once per query. Then we'd calculate the 
distance using our (very optimized) dot product method, plus the precomputed 
values.

This is an exploratory issue: I haven't tested this out yet, so I'm not sure 
how much it would help. I would at least expect it to help with cosine 
similarity – several months ago we tried out similar ideas in Elasticsearch and 
were able to get a nice boost in cosine performance.
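A small sketch of the two identities above, assuming each vector stores its squared magnitude once (at index time for documents, once per query for the query vector). The class and function names are hypothetical and plain Python lists stand in for Lucene's float arrays.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class StoredVector:
    """A vector together with its squared magnitude, computed once up front."""

    def __init__(self, values):
        self.values = list(values)
        self.sq_norm = dot(self.values, self.values)  # ||v||^2


def l2_distance(doc, query):
    # ||a - b||^2 = ||a||^2 - 2(a . b) + ||b||^2
    sq = doc.sq_norm - 2.0 * dot(doc.values, query.values) + query.sq_norm
    # Clamp at zero: rounding can make the sum slightly negative for
    # near-identical vectors.
    return math.sqrt(max(sq, 0.0))


def cosine(doc, query):
    # (a . b) / (||a|| ||b||); undefined when either vector has zero magnitude.
    return dot(doc.values, query.values) / math.sqrt(doc.sq_norm * query.sq_norm)
```

With this layout only the dot product depends on both vectors at query time; the magnitudes are a lookup.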






[jira] [Updated] (LUCENE-10141) Update releaseWizard for 8x to correctly create back-compat indices and update Version in main after repo split

2021-10-19 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova updated LUCENE-10141:
-
Fix Version/s: (was: 8.10.1)

> Update releaseWizard for 8x to correctly create back-compat indices and 
> update Version in main after repo split
> ---
>
> Key: LUCENE-10141
> URL: https://issues.apache.org/jira/browse/LUCENE-10141
> Project: Lucene - Core
>  Issue Type: Task
>  Components: release wizard
>Reporter: Timothy Potter
>Assignee: Timothy Potter
>Priority: Major
> Fix For: 8.11
>
>
> Need to update the release wizard in 8x to create the back-compat indices and 
> update the Version info so that issues like: 
> https://issues.apache.org/jira/browse/LUCENE-10131 don't impact future 8x 
> release managers. Hopefully an 8.11 is NOT needed but release managers have 
> enough on their plate to get right that we should fix this if possible. If 
> not, we at least need to document the process of doing it manually. 






[jira] [Commented] (LUCENE-10189) Optimize SortedSet/SortedNumeric doc values writers for fields that are effectively single-valued

2021-10-19 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430736#comment-17430736
 ] 

Adrien Grand commented on LUCENE-10189:
---

With the linked PR I'm getting the same flush times for single-valued fields 
and multi-valued fields that are effectively single-valued (though that doesn't 
mean indexing is equally fast, as the in-memory buffering might still have 
some more overhead in the multi-valued case).

> Optimize SortedSet/SortedNumeric doc values writers for fields that are 
> effectively single-valued
> -
>
> Key: LUCENE-10189
> URL: https://issues.apache.org/jira/browse/LUCENE-10189
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
>
> I was wondering how much overhead multi-valued doc-value types have over 
> their single-valued counterparts, so I hacked IndexTaxis to index all 
> doc-value fields via Sorted(Set|Numeric)DocValuesField instead of 
> (Sorted|Numeric)DocValuesField and flush times increased by 30%. It should be 
> easy to automatically detect such cases in the doc values writers?






[GitHub] [lucene] jtibshirani opened a new pull request #400: LUCENE-10146: Add note that dot product is preferred over cosine

2021-10-19 Thread GitBox


jtibshirani opened a new pull request #400:
URL: https://github.com/apache/lucene/pull/400


   While VectorSimilarityFunction#COSINE is helpful when you need to preserve 
the original vectors, it is significantly slower than DOT_PRODUCT. This commit 
adds javadocs to COSINE explaining that dot product is the fastest option.





[GitHub] [lucene] jtibshirani commented on pull request #366: LUCENE-10146: Add VectorSimilarityFunction.COSINE

2021-10-19 Thread GitBox


jtibshirani commented on pull request #366:
URL: https://github.com/apache/lucene/pull/366#issuecomment-947053287


   @msokolov @mayya-sharipova following up: I ran benchmarks and it's indeed 
significantly slower (around 20% on some of the datasets we've been using). 
Here's what I've done:
   * Added a note to `VectorSimilarityFunction#COSINE` explaining that 
`DOT_PRODUCT` is the preferred option when you don't need to preserve the 
original vectors: https://github.com/apache/lucene/pull/400
   * Opened https://issues.apache.org/jira/browse/LUCENE-10191 with ideas to 
speed up cosine similarity
   
   Happy to hear other feedback/ ideas!





[jira] [Commented] (LUCENE-10084) Rewrite DocValuesFieldExistsQuery to a MatchAllDocsQuery when docCount == maxDoc

2021-10-19 Thread Vigya Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430785#comment-17430785
 ] 

Vigya Sharma commented on LUCENE-10084:
---

I would like to work on this.

> Rewrite DocValuesFieldExistsQuery to a MatchAllDocsQuery when docCount == 
> maxDoc
> 
>
> Key: LUCENE-10084
> URL: https://issues.apache.org/jira/browse/LUCENE-10084
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> Now that we require all documents to use the same features (LUCENE-9334) we 
> could rewrite DocValuesFieldExistsQuery to a MatchAllDocsQuery whenever terms 
> or points have a docCount that is equal to maxDoc.






[GitHub] [lucene] mocobeta commented on pull request #394: LUCENE-9997: write release revision to system temp dir

2021-10-19 Thread GitBox


mocobeta commented on pull request #394:
URL: https://github.com/apache/lucene/pull/394#issuecomment-947141245


   The `rev.txt` file is used to reuse the git revision from the previous run 
when the `--no-prepare` option is passed.
   
https://github.com/apache/lucene/blob/c77e9ddf93ae872ba6556d39c48a0a32e31e91b1/dev-tools/scripts/buildAndPushRelease.py#L398-L402
   
   I'm not sure what the use case for this is, but if it's needed (for 
convenience?) we need an explicitly fixed path and shouldn't clean up the file 
after the first run. It's actually not a "temporary" file...
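A minimal sketch of that requirement, assuming a fixed file name under the system temp directory. The name `lucene-release-rev.txt` and both helper functions are hypothetical and not the script's actual code; `tempfile.gettempdir()` is the standard-library call that returns the platform's temp directory.

```python
import os
import tempfile

# Fixed, predictable location so that a later --no-prepare run can find it.
# The file name is an assumption made for this example.
REV_FILE = os.path.join(tempfile.gettempdir(), "lucene-release-rev.txt")


def save_revision(rev, path=REV_FILE):
    """Record the git revision of the current run."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(rev + "\n")


def load_revision(path=REV_FILE):
    """Read back the revision recorded by a previous run."""
    with open(path, encoding="utf-8") as f:
        return f.read().strip()
```

Unlike `tempfile.NamedTemporaryFile`, nothing here deletes the file automatically, which matches the need to keep it around between runs.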





[GitHub] [lucene] mocobeta merged pull request #394: LUCENE-9997: write release revision to system temp dir

2021-10-19 Thread GitBox


mocobeta merged pull request #394:
URL: https://github.com/apache/lucene/pull/394


   





[jira] [Commented] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430792#comment-17430792
 ] 

ASF subversion and git services commented on LUCENE-9997:
-

Commit 54418cef450afa8a2e45904f68c6db45e241c584 in lucene's branch 
refs/heads/main from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=54418ce ]

LUCENE-9997: write release revision to system temp dir (#394)



> Revisit smoketester for 9.0 build
> -
>
> Key: LUCENE-9997
> URL: https://issues.apache.org/jira/browse/LUCENE-9997
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Robert Muir
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
> Attachments: image-2021-10-12-12-47-11-480.png, 
> image-2021-10-12-12-48-15-373.png
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Currently we have a (great) {{dev-tools/scripts/smokeTester.py}} that will 
> perform automated tests against a release.
> This was developed with the ant build process in mind.
> This issue is just about considering the automated checks we do here, maybe 
> some of them can be done efficiently in the gradle build in earlier places: 
> this would be a large improvement!
> Obviously some of them (e.g. GPG release key verifications) are really 
> specific to the artifacts in question. These are most important to release 
> verification, as that is actually the only place we can check it.
> Any other checks (and I do tend to think, this checker should try to be 
> thorough, invoking gradle etc), should be stuff we regularly test in 
> PRs/nightly/builds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430795#comment-17430795
 ] 

Jan Høydahl commented on LUCENE-9997:
-

> r-- permissions on all maven artifact files

I have noticed that too. It is done in 
[https://github.com/apache/lucene/blob/main/dev-tools/scripts/buildAndPushRelease.py#L234]
 but I don't know why.




[GitHub] [lucene] dsmiley commented on a change in pull request #362: LUCENE-9431: UnifiedHighlighter WEIGHT_MATCHES is now true by default

2021-10-19 Thread GitBox


dsmiley commented on a change in pull request #362:
URL: https://github.com/apache/lucene/pull/362#discussion_r732287420



##
File path: 
lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java
##
@@ -1168,9 +1174,12 @@ public CacheHelper getReaderCacheHelper() {
 
 /**
  * Internally use the {@link Weight#matches(LeafReaderContext, int)} API 
for highlighting. It's
- * more accurate to the query, though might not calculate passage 
relevancy as well. Use of this
- * flag requires {@link #MULTI_TERM_QUERY} and {@link #PHRASES}. {@link
- * #PASSAGE_RELEVANCY_OVER_SPEED} will be ignored. False by default.
+ * more accurate to the query, and the snippets can be a little different 
for phrases because
+ * the whole phrase is marked up instead of each word. The passage 
relevancy calculation can be
+ * different (maybe worse?) and it's slower when highlighting many fields. 
Use of this flag
+ * requires {@link #MULTI_TERM_QUERY} and {@link #PHRASES}. {@link
+ * #PASSAGE_RELEVANCY_OVER_SPEED} will be ignored. True by default, so 
long as the requirements

Review comment:
   For the test, I think you can merely instantiate the highlighter and 
grab the flags and inspect them.







[GitHub] [lucene] janhoy opened a new pull request #401: LUCENE-10174 Speed up 'pushLocal'

2021-10-19 Thread GitBox


janhoy opened a new pull request #401:
URL: https://github.com/apache/lucene/pull/401


   https://issues.apache.org/jira/browse/LUCENE-10174
   
   When copying files from `lucene/distribution/build/release` to the target 
directory, the script uses `tar.bz2`, i.e. with compression. This is very slow 
and useless since the files are already compressed. This PR uses plain `tar` 
without compression to greatly speed up this step.





[jira] [Commented] (LUCENE-10174) Update buildAndPushRelease.py for new gradle build

2021-10-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/LUCENE-10174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430801#comment-17430801
 ] 

Jan Høydahl commented on LUCENE-10174:
--

See [GitHub Pull Request #401|https://github.com/apache/lucene/pull/401] for a 
nice speedup of the last step 'pushLocal'

> Update buildAndPushRelease.py for new gradle build
> --
>
> Key: LUCENE-10174
> URL: https://issues.apache.org/jira/browse/LUCENE-10174
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: main (9.0)
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> With LUCENE-9488 and LUCENE-10173 the gradle build was polished to properly 
> build source and binary artifacts, and sign those using either gpg tool or a 
> built-in java-based signing plugin. See 
> [https://github.com/apache/lucene/blob/main/help/publishing.txt]
> This jira will update {{buildAndPushRelease.py}} script to use the correct 
> build parameters. It will also add cmdline args to choose between gpg and 
> built-in (gpg default), and to supply the location of {{gpgHome}} if you do 
> not use gpg. We'll also add an option to NOT prompt for passphrase in the 
> python script, which will fallback to defaults (gpg-agent, env.vars or 
> gradle.properties).






[jira] [Commented] (LUCENE-9997) Revisit smoketester for 9.0 build

2021-10-19 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430811#comment-17430811
 ] 

Tomoko Uchida commented on LUCENE-9997:
---

bq. I think we should make the git revision part of the distribution artifacts 
- then the smoke tester can read it directly from the distribution artifact 
release folder. Moreover, the git revision could also be part of the "source" 
distribution of Lucene - then the build scripts can be tweaked to actually work 
without the git clone (on the true "source" distribution) by simulating the git 
revision read from such a file.

+1 - if we are willing to refactor the huge smoketester script...




[jira] [Commented] (LUCENE-10190) Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber

2021-10-19 Thread Nhat Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430833#comment-17430833
 ] 

Nhat Nguyen commented on LUCENE-10190:
--

I am looking at this failure.

> Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber
> -
>
> Key: LUCENE-10190
> URL: https://issues.apache.org/jira/browse/LUCENE-10190
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Priority: Minor
>
> CI failure in PR at:
> https://github.com/apache/lucene/pull/396/checks?check_run_id=3936559246
> Does not reproduce. Stack below.
> {code}
> org.apache.lucene.index.TestIndexWriter > testMaxCompletedSequenceNumber 
> FAILED
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=1840, name=Thread-1481, 
> state=RUNNABLE, group=TGRP-TestIndexWriter]
> at 
> __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)
> Caused by:
> java.lang.AssertionError: expected:<1> but was:<0>
> at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:633)
> at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
> at java.base/java.lang.Thread.run(Thread.java:829)
> org.apache.lucene.index.TestIndexWriter > test suite's output saved to 
> /home/runner/work/lucene/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexWriter.txt,
>  copied below:
>   2> أكتوبر ١٩, ٢٠٢١ ١٠:٢٧:٢٧ ص 
> com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
>  uncaughtException
>   2> WARNING: Uncaught exception in thread: 
> Thread[Thread-1481,5,TGRP-TestIndexWriter]
>   2> java.lang.AssertionError: expected:<1> but was:<0>
>   2>  at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
>   2>  at org.junit.Assert.fail(Assert.java:89)
>   2>  at org.junit.Assert.failNotEquals(Assert.java:835)
>   2>  at org.junit.Assert.assertEquals(Assert.java:647)
>   2>  at org.junit.Assert.assertEquals(Assert.java:633)
>   2>  at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
>   2>  at java.base/java.lang.Thread.run(Thread.java:829)
>   2> 
>> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured 
> an uncaught exception in thread: Thread[id=1840, name=Thread-1481, 
> state=RUNNABLE, group=TGRP-TestIndexWriter]
>> at 
> __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)
>> 
>> Caused by:
>> java.lang.AssertionError: expected:<1> but was:<0>
>> at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
>> at org.junit.Assert.fail(Assert.java:89)
>> at org.junit.Assert.failNotEquals(Assert.java:835)
>> at org.junit.Assert.assertEquals(Assert.java:647)
>> at org.junit.Assert.assertEquals(Assert.java:633)
>> at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
>> at java.base/java.lang.Thread.run(Thread.java:829)
>   2> NOTE: reproduce with: gradlew test --tests 
> TestIndexWriter.testMaxCompletedSequenceNumber -Dtests.seed=5B8EFE8DBEFB881C 
> -Dtests.badapples=true -Dtests.locale=ar-KW -Dtests.timezone=Africa/Lusaka 
> -Dtests.asserts=true -Dtests.file.encoding=UTF-8
> {code}






[jira] [Commented] (LUCENE-10191) Optimize vector functions by precomputing magnitudes

2021-10-19 Thread Mayya Sharipova (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430835#comment-17430835
 ] 

Mayya Sharipova commented on LUCENE-10191:
--

+1 great ideas to explore the performance boost.

> Optimize vector functions by precomputing magnitudes
> 
>
> Key: LUCENE-10191
> URL: https://issues.apache.org/jira/browse/LUCENE-10191
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Julie Tibshirani
>Priority: Minor
>
> Both euclidean distance (L2 norm) and cosine similarity can be expressed in 
> terms of dot product and vector magnitudes:
>  * l2_norm(a, b) = ||a - b|| = sqrt(||a||^2 - 2(a . b) + ||b||^2)
>  * cosine(a, b) = (a . b) / (||a|| ||b||)
> We could compute and store each vector's magnitude upfront while indexing, 
> and compute the query vector's magnitude once per query. Then we'd calculate 
> the distance using our (very optimized) dot product method, plus the 
> precomputed values.
> This is an exploratory issue: I haven't tested this out yet, so I'm not sure 
> how much it would help. I would at least expect it to help with cosine 
> similarity – several months ago we tried out similar ideas in Elasticsearch 
> and were able to get a nice boost in cosine performance.
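
The two identities in the issue description can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not Lucene code: the helper names are hypothetical, and a real implementation would use the optimized dot product and store the magnitudes alongside the indexed vectors.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def magnitude(v):
    """Vector length ||v||; this is the value that would be precomputed."""
    return math.sqrt(dot(v, v))


def l2_distance(a_dot_b, mag_a, mag_b):
    """||a - b|| from the dot product and the two precomputed magnitudes."""
    squared = mag_a * mag_a - 2.0 * a_dot_b + mag_b * mag_b
    # Clamp at zero: rounding can make the value slightly negative.
    return math.sqrt(max(squared, 0.0))


def cosine(a_dot_b, mag_a, mag_b):
    """Cosine similarity; assumes neither vector has zero magnitude."""
    return a_dot_b / (mag_a * mag_b)
```

With the magnitudes stored at index time, each comparison costs one dot product plus a few scalar operations. Whether that is actually faster than the current code is exactly what the issue proposes to measure.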






[jira] [Assigned] (LUCENE-10190) Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber

2021-10-19 Thread Nhat Nguyen (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nhat Nguyen reassigned LUCENE-10190:


Assignee: Nhat Nguyen




[jira] [Commented] (LUCENE-10190) Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber

2021-10-19 Thread Nhat Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430843#comment-17430843
 ] 

Nhat Nguyen commented on LUCENE-10190:
--

I can reproduce the issue by sleeping for a few ms before we increase the 
numDocsInRam 
(https://github.com/apache/lucene/commit/bbd24e865451d73fc228bd6a0b712508bd111b05).
I will be working on the fix.




[jira] [Comment Edited] (LUCENE-10190) Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber

2021-10-19 Thread Nhat Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430843#comment-17430843
 ] 

Nhat Nguyen edited comment on LUCENE-10190 at 10/20/21, 2:45 AM:
-

I can reproduce the issue by sleeping for a few ms before we increase the 
numDocsInRam 
(https://github.com/apache/lucene/commit/bbd24e865451d73fc228bd6a0b712508bd111b05).
I will be working on the fix.


was (Author: dnhatn):
I can reproduce the issue by sleeping for a few ms before we increase the 
numDocsInRam 
([https://github.com/apache/lucene/commit/bbd24e865451d73fc228bd6a0b712508bd111b05).]
 I will be working on the fix.




[jira] [Comment Edited] (LUCENE-10190) Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber

2021-10-19 Thread Nhat Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430843#comment-17430843
 ] 

Nhat Nguyen edited comment on LUCENE-10190 at 10/20/21, 2:46 AM:
-

I can reproduce the issue by sleeping for a few ms before we increase the 
numDocsInRam 
(https://github.com/apache/lucene/commit/bbd24e865451d73fc228bd6a0b712508bd111b05).
I will be working on the fix.


was (Author: dnhatn):
I can reproduce the issue by sleeping for a few ms before we increase the 
numDocsInRam ([ 
https://github.com/apache/lucene/commit/bbd24e865451d73fc228bd6a0b712508bd111b05
 
).|https://github.com/apache/lucene/commit/bbd24e865451d73fc228bd6a0b712508bd111b05).]
 I will be working on the fix.

> Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber
> -
>
> Key: LUCENE-10190
> URL: https://issues.apache.org/jira/browse/LUCENE-10190
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Nhat Nguyen
>Priority: Minor
>
> CI failure in PR at:
> https://github.com/apache/lucene/pull/396/checks?check_run_id=3936559246
> Does not reproduce. Stack below.
> {code}
> org.apache.lucene.index.TestIndexWriter > testMaxCompletedSequenceNumber 
> FAILED
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=1840, name=Thread-1481, 
> state=RUNNABLE, group=TGRP-TestIndexWriter]
> at 
> __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)
> Caused by:
> java.lang.AssertionError: expected:<1> but was:<0>
> at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:633)
> at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
> at java.base/java.lang.Thread.run(Thread.java:829)
> org.apache.lucene.index.TestIndexWriter > test suite's output saved to 
> /home/runner/work/lucene/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexWriter.txt,
>  copied below:
>   2> أكتوبر ١٩, ٢٠٢١ ١٠:٢٧:٢٧ ص 
> com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
>  uncaughtException
>   2> WARNING: Uncaught exception in thread: 
> Thread[Thread-1481,5,TGRP-TestIndexWriter]
>   2> java.lang.AssertionError: expected:<1> but was:<0>
>   2>  at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
>   2>  at org.junit.Assert.fail(Assert.java:89)
>   2>  at org.junit.Assert.failNotEquals(Assert.java:835)
>   2>  at org.junit.Assert.assertEquals(Assert.java:647)
>   2>  at org.junit.Assert.assertEquals(Assert.java:633)
>   2>  at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
>   2>  at java.base/java.lang.Thread.run(Thread.java:829)
>   2> 
>> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured 
> an uncaught exception in thread: Thread[id=1840, name=Thread-1481, 
> state=RUNNABLE, group=TGRP-TestIndexWriter]
>> at 
> __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)
>> 
>> Caused by:
>> java.lang.AssertionError: expected:<1> but was:<0>
>> at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
>> at org.junit.Assert.fail(Assert.java:89)
>> at org.junit.Assert.failNotEquals(Assert.java:835)
>> at org.junit.Assert.assertEquals(Assert.java:647)
>> at org.junit.Assert.assertEquals(Assert.java:633)
>> at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
>> at java.base/java.lang.Thread.run(Thread.java:829)
>   2> NOTE: reproduce with: gradlew test --tests 
> TestIndexWriter.testMaxCompletedSequenceNumber -Dtests.seed=5B8EFE8DBEFB881C 
> -Dtests.badapples=true -Dtests.locale=ar-KW -Dtests.timezone=Africa/Lusaka 
> -Dtests.asserts=true -Dtests.file.encoding=UTF-8
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Commented] (LUCENE-10190) Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber

2021-10-19 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430980#comment-17430980
 ] 

Dawid Weiss commented on LUCENE-10190:
--

Thank you, [~dnhatn]!

> Assertion error in TestIndexWriter.testMaxCompletedSequenceNumber
> -
>
> Key: LUCENE-10190
> URL: https://issues.apache.org/jira/browse/LUCENE-10190
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Dawid Weiss
> Assignee: Nhat Nguyen
> Priority: Minor
>
> CI failure in PR at:
> https://github.com/apache/lucene/pull/396/checks?check_run_id=3936559246
> Does not reproduce. Stack below.
> {code}
> org.apache.lucene.index.TestIndexWriter > testMaxCompletedSequenceNumber 
> FAILED
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=1840, name=Thread-1481, 
> state=RUNNABLE, group=TGRP-TestIndexWriter]
> at 
> __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)
> Caused by:
> java.lang.AssertionError: expected:<1> but was:<0>
> at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:633)
> at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
> at java.base/java.lang.Thread.run(Thread.java:829)
> org.apache.lucene.index.TestIndexWriter > test suite's output saved to 
> /home/runner/work/lucene/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexWriter.txt,
>  copied below:
>   2> أكتوبر ١٩, ٢٠٢١ ١٠:٢٧:٢٧ ص 
> com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
>  uncaughtException
>   2> WARNING: Uncaught exception in thread: 
> Thread[Thread-1481,5,TGRP-TestIndexWriter]
>   2> java.lang.AssertionError: expected:<1> but was:<0>
>   2>  at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
>   2>  at org.junit.Assert.fail(Assert.java:89)
>   2>  at org.junit.Assert.failNotEquals(Assert.java:835)
>   2>  at org.junit.Assert.assertEquals(Assert.java:647)
>   2>  at org.junit.Assert.assertEquals(Assert.java:633)
>   2>  at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
>   2>  at java.base/java.lang.Thread.run(Thread.java:829)
>   2> 
>> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured 
> an uncaught exception in thread: Thread[id=1840, name=Thread-1481, 
> state=RUNNABLE, group=TGRP-TestIndexWriter]
>> at 
> __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C:671A014F243A0EDE]:0)
>> 
>> Caused by:
>> java.lang.AssertionError: expected:<1> but was:<0>
>> at __randomizedtesting.SeedInfo.seed([5B8EFE8DBEFB881C]:0)
>> at org.junit.Assert.fail(Assert.java:89)
>> at org.junit.Assert.failNotEquals(Assert.java:835)
>> at org.junit.Assert.assertEquals(Assert.java:647)
>> at org.junit.Assert.assertEquals(Assert.java:633)
>> at 
> org.apache.lucene.index.TestIndexWriter.lambda$testMaxCompletedSequenceNumber$53(TestIndexWriter.java:4305)
>> at java.base/java.lang.Thread.run(Thread.java:829)
>   2> NOTE: reproduce with: gradlew test --tests 
> TestIndexWriter.testMaxCompletedSequenceNumber -Dtests.seed=5B8EFE8DBEFB881C 
> -Dtests.badapples=true -Dtests.locale=ar-KW -Dtests.timezone=Africa/Lusaka 
> -Dtests.asserts=true -Dtests.file.encoding=UTF-8
> {code}






[jira] [Created] (LUCENE-10192) Drop third-party JARs from the binary distribution

2021-10-19 Thread Dawid Weiss (Jira)
Dawid Weiss created LUCENE-10192:


Summary: Drop third-party JARs from the binary distribution
Key: LUCENE-10192
URL: https://issues.apache.org/jira/browse/LUCENE-10192
Project: Lucene - Core
Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss


[~janhoy] Are we ready (with respect to scripts) for this change? I'd like to 
do it, but I'm not sure whether the release wizard depends on these JARs 
somehow (I will handle buildAndPushRelease.py and smokeTestRelease.py if they 
need fixes, but I'm not sure about the releaseWizard.*).






[jira] [Commented] (LUCENE-10084) Rewrite DocValuesFieldExistsQuery to a MatchAllDocsQuery when docCount == maxDoc

2021-10-19 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17431010#comment-17431010
 ] 

Adrien Grand commented on LUCENE-10084:
---

Please feel free to give it a try!

> Rewrite DocValuesFieldExistsQuery to a MatchAllDocsQuery when docCount == 
> maxDoc
> 
>
> Key: LUCENE-10084
> URL: https://issues.apache.org/jira/browse/LUCENE-10084
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
>
> Now that we require all documents to use the same features (LUCENE-9334) we 
> could rewrite DocValuesFieldExistsQuery to a MatchAllDocsQuery whenever terms 
> or points have a docCount that is equal to maxDoc.
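
The condition described in the issue can be sketched in plain Java. This is 
only a model of the decision, not Lucene code: SegmentStats and 
canRewriteToMatchAll are hypothetical names, and the sketch assumes that a 
per-segment docCount (taken from terms or points) and maxDoc are available for 
the field.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only: SegmentStats and canRewriteToMatchAll are
// hypothetical names, not Lucene classes or methods.
final class FieldExistsRewriteSketch {

    /** Per-segment numbers for one field (assumed to come from terms or points). */
    static final class SegmentStats {
        final int maxDoc;    // the segment's maxDoc
        final int docCount;  // documents that have at least one value for the field

        SegmentStats(int maxDoc, int docCount) {
            this.maxDoc = maxDoc;
            this.docCount = docCount;
        }
    }

    /** True only if, in every segment, docCount equals maxDoc for the field. */
    static boolean canRewriteToMatchAll(List<SegmentStats> segments) {
        for (SegmentStats s : segments) {
            if (s.docCount != s.maxDoc) {
                return false; // some document lacks the field; keep the original query
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<SegmentStats> dense = Arrays.asList(new SegmentStats(10, 10), new SegmentStats(5, 5));
        List<SegmentStats> sparse = Arrays.asList(new SegmentStats(10, 10), new SegmentStats(5, 4));
        System.out.println(canRewriteToMatchAll(dense));
        System.out.println(canRewriteToMatchAll(sparse));
    }
}
```

The check is applied to every segment because a single segment in which some 
document lacks the field is enough to make the rewrite incorrect.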





