[ https://issues.apache.org/jira/browse/LUCENE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094921#comment-17094921 ]
Chris M. Hostetter edited comment on LUCENE-9191 at 4/28/20, 11:21 PM:
-----------------------------------------------------------------------

Apache jenkins found some reproducing failures originating from {{BasePostingsFormatTestCase.testInvertedWrite}} that seem suspiciously related to this issue.

The seeds alone don't seem to reproduce for me, but I didn't try downloading & using the same {{tests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt}} used by jenkins...

https://builds.apache.org/view/L/view/Lucene/job/Lucene-Solr-NightlyTests-master/2170/

{noformat}
> git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
> git config remote.origin.url > https://gitbox.apache.org/repos/asf/lucene-solr.git # timeout=10
Fetching upstream changes from https://gitbox.apache.org/repos/asf/lucene-solr.git
> git --version # timeout=10
> git fetch --tags --progress > https://gitbox.apache.org/repos/asf/lucene-solr.git > +refs/heads/*:refs/remotes/origin/*
> git rev-parse refs/remotes/origin/master^{commit} # timeout=10
> git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision e0c06ee6a6db925efa40b6633869c800d5745261 (refs/remotes/origin/master)
> git config core.sparsecheckout # timeout=10
> git checkout -f e0c06ee6a6db925efa40b6633869c800d5745261
Commit message: "LUCENE-9191: make LineFileDocs random seeking more efficient by recording safe skip points in the concatenated gzip'd chunks"
> git rev-list --no-walk c94770c2b9c00ccdc2d617d595d62f85a332dc0c # timeout=10
Cleaning workspace
> git rev-parse --verify HEAD # timeout=10
Resetting working tree
> git reset --hard # timeout=10
> git clean -fdx # timeout=10
Cleaning up /home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data
Updating http://svn.apache.org/repos/asf/lucene/test-data at revision '2020-04-22T04:32:15.464 +0000'
At revision 1876810
No emails were triggered.
[checkout] $ /home/jenkins/tools/ant/apache-ant-1.8.4/bin/ant -file build.xml -Dtests.multiplier=2 -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt jenkins-nightly
Buildfile: /home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/checkout/build.xml
jenkins-nightly:
-print-java-info:
[java-info] java version "11.0.4"
[java-info] Java(TM) SE Runtime Environment (11.0.4+10-LTS, Oracle Corporation)
[java-info] Java HotSpot(TM) 64-Bit Server VM (11.0.4+10-LTS, Oracle Corporation)
[java-info] Test args: [-XX:TieredStopAtLevel=1]
...
[junit4] 2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestDirectPostingsFormat -Dtests.method=testInvertedWrite -Dtests.seed=172C6414BE5E2A2C -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt -Dtests.locale=shi-Tfng-MA -Dtests.timezone=SystemV/MST7 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR 0.03s J2 | TestDirectPostingsFormat.testInvertedWrite <<<
[junit4] > Throwable #1: java.nio.charset.MalformedInputException: Input length = 1
[junit4] > at __randomizedtesting.SeedInfo.seed([172C6414BE5E2A2C:E5829DFC005A1F0]:0)
[junit4] > at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
[junit4] > at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
[junit4] > at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
[junit4] > at java.base/java.io.InputStreamReader.read(InputStreamReader.java:185)
[junit4] > at java.base/java.io.BufferedReader.fill(BufferedReader.java:161)
[junit4] > at java.base/java.io.BufferedReader.readLine(BufferedReader.java:326)
[junit4] > at java.base/java.io.BufferedReader.readLine(BufferedReader.java:392)
[junit4] > at org.apache.lucene.util.LineFileDocs.open(LineFileDocs.java:175)
[junit4] > at org.apache.lucene.util.LineFileDocs.<init>(LineFileDocs.java:65)
[junit4] > at org.apache.lucene.util.LineFileDocs.<init>(LineFileDocs.java:69)
[junit4] > at org.apache.lucene.index.BasePostingsFormatTestCase.testInvertedWrite(BasePostingsFormatTestCase.java:540)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[junit4] > at java.base/java.lang.Thread.run(Thread.java:834)
[junit4] 1> [_0.cfe, _0.cfs, _0.si, segments_1]
...
[junit4] 2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestBloomPostingsFormat -Dtests.method=testInvertedWrite -Dtests.seed=172C6414BE5E2A2C -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt -Dtests.locale=vai-Vaii -Dtests.timezone=SystemV/HST10 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR 0.01s J1 | TestBloomPostingsFormat.testInvertedWrite <<<
[junit4] > Throwable #1: java.nio.charset.MalformedInputException: Input length = 1
[junit4] > at __randomizedtesting.SeedInfo.seed([172C6414BE5E2A2C:E5829DFC005A1F0]:0)
[junit4] > at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
[junit4] > at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
[junit4] > at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
[junit4] > at java.base/java.io.InputStreamReader.read(InputStreamReader.java:185)
[junit4] > at java.base/java.io.BufferedReader.fill(BufferedReader.java:161)
[junit4] > at java.base/java.io.BufferedReader.readLine(BufferedReader.java:326)
[junit4] > at java.base/java.io.BufferedReader.readLine(BufferedReader.java:392)
[junit4] > at org.apache.lucene.util.LineFileDocs.open(LineFileDocs.java:175)
[junit4] > at org.apache.lucene.util.LineFileDocs.<init>(LineFileDocs.java:65)
[junit4] > at org.apache.lucene.util.LineFileDocs.<init>(LineFileDocs.java:69)
[junit4] > at org.apache.lucene.index.BasePostingsFormatTestCase.testInvertedWrite(BasePostingsFormatTestCase.java:540)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[junit4] > at java.base/java.lang.Thread.run(Thread.java:834)
[junit4] 1> [_0.cfe, _0.cfs, _0.si, segments_1]
...
[junit4] 2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestVarGapDocFreqIntervalPostingsFormat -Dtests.method=testInvertedWrite -Dtests.seed=172C6414BE5E2A2C -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt -Dtests.locale=uz-Cyrl-UZ -Dtests.timezone=America/Maceio -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR 0.01s J1 | TestVarGapDocFreqIntervalPostingsFormat.testInvertedWrite <<<
[junit4] > Throwable #1: java.nio.charset.MalformedInputException: Input length = 1
[junit4] > at __randomizedtesting.SeedInfo.seed([172C6414BE5E2A2C:E5829DFC005A1F0]:0)
[junit4] > at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
[junit4] > at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
[junit4] > at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
[junit4] > at java.base/java.io.InputStreamReader.read(InputStreamReader.java:185)
[junit4] > at java.base/java.io.BufferedReader.fill(BufferedReader.java:161)
[junit4] > at java.base/java.io.BufferedReader.readLine(BufferedReader.java:326)
[junit4] > at java.base/java.io.BufferedReader.readLine(BufferedReader.java:392)
[junit4] > at org.apache.lucene.util.LineFileDocs.open(LineFileDocs.java:175)
[junit4] > at org.apache.lucene.util.LineFileDocs.<init>(LineFileDocs.java:65)
[junit4] > at org.apache.lucene.util.LineFileDocs.<init>(LineFileDocs.java:69)
[junit4] > at org.apache.lucene.index.BasePostingsFormatTestCase.testInvertedWrite(BasePostingsFormatTestCase.java:540)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[junit4] > at java.base/java.lang.Thread.run(Thread.java:834)
[junit4] 1> [_0.cfe, _0.cfs, _0.si, segments_2]
... etc...
{noformat}

Repro build found all of these seeds reproduce reliably on jenkins...

https://builds.apache.org/view/L/view/Lucene/job/Lucene-Solr-repro-Java11/1042/

> Fix linefiledocs compression or replace in tests
> ------------------------------------------------
>
> Key: LUCENE-9191
> URL: https://issues.apache.org/jira/browse/LUCENE-9191
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Robert Muir
> Assignee: Michael McCandless
> Priority: Major
> Fix For: 8.6
>
> Attachments: LUCENE-9191.patch, LUCENE-9191.patch
>
>
> LineFileDocs(random) is very slow, even to open. It does a very slow "random skip" through a gzip compressed file.
> For the analyzers tests, in LUCENE-9186 I simply removed its usage, since TestUtil.randomAnalysisString is superior, and fast. But we should address other tests using it, since LineFileDocs(random) is slow!
> I think it is also the case that every lucene test has probably tested every LineFileDocs line many times now, whereas randomAnalysisString will invent new ones.
> Alternatively, we could "fix" LineFileDocs(random), e.g. special compression options (in blocks)... deflate supports such stuff. But it would make it even hairier than it is now.
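For readers unfamiliar with the idea mentioned in the issue description (compressing "in blocks") and in the commit message (recording safe skip points in concatenated gzip'd chunks), here is a minimal Java sketch. It is not the LUCENE-9191 patch and not Lucene's LineFileDocs code: the class ChunkedGzipSketch, its Packed holder, and the methods pack and readChunk are hypothetical names invented for illustration. The sketch assumes each chunk of lines is written as its own gzip member and that the byte offset of every member is remembered, so a reader can start decompressing at a recorded offset instead of inflating everything before it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical illustration only; not the Lucene implementation. */
final class ChunkedGzipSketch {

  /** Concatenated gzip members plus the byte offset at which each member starts. */
  static final class Packed {
    final byte[] bytes;
    final List<Integer> skipPoints;

    Packed(byte[] bytes, List<Integer> skipPoints) {
      this.bytes = bytes;
      this.skipPoints = skipPoints;
    }
  }

  /** Compresses every group of linesPerChunk lines as a separate gzip member. */
  static Packed pack(List<String> lines, int linesPerChunk) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    List<Integer> skipPoints = new ArrayList<>();
    try {
      for (int start = 0; start < lines.size(); start += linesPerChunk) {
        // A member boundary is a safe place to begin decompressing later.
        skipPoints.add(out.size());
        int end = Math.min(start + linesPerChunk, lines.size());
        // Closing a ByteArrayOutputStream is a no-op, so closing each
        // member's GZIPOutputStream only finishes that member.
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
          for (String line : lines.subList(start, end)) {
            gz.write((line + "\n").getBytes(StandardCharsets.UTF_8));
          }
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new Packed(out.toByteArray(), skipPoints);
  }

  /** Decompresses only the member that starts at the given skip point. */
  static String readChunk(Packed packed, int chunk) {
    int from = packed.skipPoints.get(chunk);
    int to = chunk + 1 < packed.skipPoints.size()
        ? packed.skipPoints.get(chunk + 1)
        : packed.bytes.length;
    try (GZIPInputStream in = new GZIPInputStream(
        new ByteArrayInputStream(packed.bytes, from, to - from))) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Decoding whole members and only then converting to text keeps the read aligned to line boundaries; starting a strict character decoder at an arbitrary byte position would not have that guarantee. Whether that is related to the MalformedInputException in the log above is not established here.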