[ https://issues.apache.org/jira/browse/LUCENE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099163#comment-17099163 ]
ASF subversion and git services commented on LUCENE-9191:
---------------------------------------------------------

Commit 1783c4ad47990d1a88ac3bb44b2da2c2d2abcc79 in lucene-solr's branch refs/heads/master from Michael McCandless
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1783c4a ]

LUCENE-9191: ensure LineFileDocs random seeking effort does not seek into the middle of a multi-byte UTF-8 encoded Unicode character

> Fix linefiledocs compression or replace in tests
> ------------------------------------------------
>
>                 Key: LUCENE-9191
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9191
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Assignee: Michael McCandless
>            Priority: Major
>             Fix For: 8.6
>
>         Attachments: LUCENE-9191.patch, LUCENE-9191.patch, LUCENE-9191.patch
>
>
> LineFileDocs(random) is very slow, even to open. It does a very slow "random
> skip" through a gzip compressed file.
> For the analyzers tests, in LUCENE-9186 I simply removed its usage, since
> TestUtil.randomAnalysisString is superior, and fast. But we should address
> other tests using it, since LineFileDocs(random) is slow!
> I think it is also the case that every lucene test has probably tested every
> LineFileDocs line many times now, whereas randomAnalysisString will invent
> new ones.
> Alternatively, we could "fix" LineFileDocs(random), e.g. special compression
> options (in blocks)... deflate supports such stuff. But it would make it even
> hairier than it is now.
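
For readers unfamiliar with the problem the commit message names: a random byte offset into UTF-8 data can land on a continuation byte (bit pattern 10xxxxxx), and decoding from there yields a broken character. The following is a minimal sketch of one way to realign such an offset. It is not the code from the commit; the class name Utf8SeekSketch and both method names are hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the Lucene commit.
public class Utf8SeekSketch {

    /** Returns true if b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
    static boolean isContinuationByte(byte b) {
        return (b & 0xC0) == 0x80;
    }

    /**
     * Given an arbitrary offset into UTF-8 encoded data, advances past any
     * continuation bytes so that the returned offset sits on the first byte
     * of a character (or at the end of the data).
     */
    static int alignToCharStart(byte[] utf8, int offset) {
        int pos = offset;
        while (pos < utf8.length && isContinuationByte(utf8[pos])) {
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        // 'a' is 1 byte, U+00E9 is 2 bytes, U+20AC is 3 bytes, 'z' is 1 byte.
        byte[] data = "a\u00e9\u20acz".getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i <= data.length; i++) {
            System.out.println(i + " -> " + alignToCharStart(data, i));
        }
    }
}
```

Skipping forward, rather than backing up, is an assumption of this sketch. Either direction gives a valid character boundary, since a UTF-8 character has at most three continuation bytes.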