[
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276599#comment-17276599
]
Julie Tibshirani commented on LUCENE-9705:
------------------------------------------
{quote}It's especially clear here where we must copy a lot of classes with no
change at all, merely to clearly and consistently document the index version
change.
{quote}
I’ll try to add some context since I suspect there might be a misunderstanding.
In general when there is a new major version, we *do not* plan to create all
new index format classes. We only copy a class and move it to backwards-codecs
when there is a change to that specific format, for example {{PointsFormat}}.
This proposal applies only to the 9.0 release, and its main purpose is to
support the work in LUCENE-9047 to move all formats to little endian. My
understanding is that moving to little endian impacts all the formats and would
be much cleaner if we used fresh {{Lucene90*Format}} classes.
{quote}I wonder if we (eventually) should consider shifting to a versioning
system that doesn't require new classes. Is this somehow a feature of the
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If
a user wants to read indices from a previous major version, they can depend on
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we
allow reading indices back to version N-2.
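To make the name-based lookup concrete, here is a minimal sketch of the idea. It is not Lucene's actual loader: {{NamedCodec}} and {{CodecRegistrySketch}} are hypothetical names, and the registry is filled by hand rather than from the classpath. The codec names are only illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a codec; not Lucene's Codec class. */
interface NamedCodec {
    String name();
}

/** Hypothetical name-keyed registry that mimics lookup through service discovery. */
final class CodecRegistrySketch {
    private final Map<String, NamedCodec> byName = new LinkedHashMap<>();

    /** In a real setup, providers would be discovered from jars on the classpath. */
    void register(NamedCodec codec) {
        byName.put(codec.name(), codec);
    }

    /** Resolves the codec whose name was recorded with the index. */
    NamedCodec forName(String name) {
        NamedCodec codec = byName.get(name);
        if (codec == null) {
            throw new IllegalArgumentException(
                "No codec named '" + name + "' is available; known codecs: " + byName.keySet());
        }
        return codec;
    }

    public static void main(String[] args) {
        CodecRegistrySketch registry = new CodecRegistrySketch();
        registry.register(() -> "Lucene90"); // current codec, always present

        try {
            registry.forName("Lucene87");    // older codec is not registered yet
        } catch (IllegalArgumentException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }

        registry.register(() -> "Lucene87"); // as if backwards-codecs had been added
        System.out.println("Resolved: " + registry.forName("Lucene87").name());
    }
}
```

The point of the sketch is that the reader never branches on a version number. It asks for a codec by name, and the older implementation is available only if the module that contains it is present.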
I personally really like the current "copy-on-write" system for formats.
There’s code duplication, but it has advantages over combining different
version logic in the same file:
* It’s really clear how each version behaves. Having a direct copy like
{{Lucene70Codec}} is almost as if we were pulling in the codec jars from Lucene
7.0.
* It decreases the risk of introducing bugs or accidental changes. If you’re
making an enhancement to a new format, there’s little chance of changing the
logic for an old format (since it lives in a separate class). This is
especially important since older formats are not tested as thoroughly.
I started to appreciate it after experiencing the alternative in Elasticsearch,
where we’re constantly bumping into if/else version checks when making changes.
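As a rough illustration of the two styles, the sketch below contrasts a frozen copy plus a fresh class against a single class with a version check, using the endianness change as the example. All class names here are hypothetical and heavily simplified; none of them are Lucene or Elasticsearch APIs.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hypothetical, simplified format interface; not a Lucene API. */
interface IntFormatSketch {
    byte[] encodeInt(int value);
}

/** Frozen copy that would live in backwards-codecs: big endian, never edited again. */
final class Legacy80IntFormat implements IntFormatSketch {
    @Override
    public byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }
}

/** Fresh class for the new version: little endian, free to evolve. */
final class Current90IntFormat implements IntFormatSketch {
    @Override
    public byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }
}

/** The alternative: one class that branches on the version wherever behavior differs. */
final class VersionCheckedIntFormat implements IntFormatSketch {
    private final int majorVersion;

    VersionCheckedIntFormat(int majorVersion) {
        this.majorVersion = majorVersion;
    }

    @Override
    public byte[] encodeInt(int value) {
        // Every future change has to be threaded through checks like this one.
        ByteOrder order = majorVersion >= 9 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }
}

final class FormatVersioningSketch {
    public static void main(String[] args) {
        System.out.println(Arrays.toString(new Legacy80IntFormat().encodeInt(1)));  // [0, 0, 0, 1]
        System.out.println(Arrays.toString(new Current90IntFormat().encodeInt(1))); // [1, 0, 0, 0]
    }
}
```

With separate classes, editing {{Current90IntFormat}} cannot alter what {{Legacy80IntFormat}} writes or reads. With the version-checked class, both behaviors share one method, so a change to the new path can silently affect the old one.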
> Move all codec formats to the o.a.l.codecs.Lucene90 package
> -----------------------------------------------------------
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
> Issue Type: Wish
> Reporter: Ignacio Vera
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the
> Lucene version in which they were created. With the upcoming release of Lucene 9.0, it
> would be nice to move all those formats to just the o.a.l.codecs.Lucene90
> package (and of course moving the current ones to the backwards-codecs).
> This issue would actually facilitate moving the directory API to little
> endian (LUCENE-9047) as the only codecs that would need to handle backwards
> compatibility would be those in backwards-codecs.
> In addition, it can help formalising the use of internal versions vs format
> versioning (LUCENE-9616).
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)