[
https://issues.apache.org/jira/browse/LUCENE-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027607#comment-17027607
]
Robert Muir commented on LUCENE-9191:
-------------------------------------
Basically we would not compress it as one giant stream but in blocks (in a way
similar to how stored fields are done).
It would still be a regular gzip file... it would just have some safe places
you could seek to quickly.
Maybe it's simple and we just recompress it with a tool such as
https://zlib.net/pigz/ (that is how parallel gzip works).
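
A minimal sketch of the block idea, in Python for brevity and not taken from
LineFileDocs or any Lucene code. It assumes each block is written as its own
gzip member and that the member start offsets are kept in a side index; the
function names and the lines_per_block parameter are hypothetical. Concatenated
gzip members still form a valid gzip file, so ordinary tools can decompress the
whole thing, while a reader holding the offsets can jump straight to a block.

```python
import gzip
import io


def compress_in_blocks(lines, lines_per_block=1000):
    """Compress `lines` (a list of bytes) as concatenated gzip members.

    Returns (data, offsets), where offsets[i] is the byte position at
    which block i starts inside data.
    """
    out = io.BytesIO()
    offsets = []
    for start in range(0, len(lines), lines_per_block):
        offsets.append(out.tell())  # a safe place to seek to
        block = b"".join(lines[start:start + lines_per_block])
        out.write(gzip.compress(block))  # one independent gzip member
    return out.getvalue(), offsets


def first_line_of_block(data, offsets, block_index):
    """Seek to the start of a block and decompress only its first line."""
    raw = io.BytesIO(data)  # stands in for an open file
    raw.seek(offsets[block_index])
    with gzip.GzipFile(fileobj=raw, mode="rb") as member:
        return member.readline()
```

A random "skip" then becomes: pick a random entry from offsets, seek there, and
start reading lines, instead of decompressing everything before that point.
Smaller blocks give more seek points but usually compress a little worse.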
> Fix linefiledocs compression or replace in tests
> ------------------------------------------------
>
> Key: LUCENE-9191
> URL: https://issues.apache.org/jira/browse/LUCENE-9191
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Robert Muir
> Priority: Major
>
> LineFileDocs(random) is very slow, even to open. It does a very slow "random
> skip" through a gzip compressed file.
> For the analyzers tests, in LUCENE-9186 I simply removed its usage, since
> TestUtil.randomAnalysisString is superior, and fast. But we should address
> other tests using it, since LineFileDocs(random) is slow!
> I think it is also the case that every Lucene test has probably tested every
> LineFileDocs line many times now, whereas randomAnalysisString will invent
> new ones.
> Alternatively, we could "fix" LineFileDocs(random), e.g. special compression
> options (in blocks)... deflate supports such stuff. But it would make it even
> hairier than it is now.