[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276599#comment-17276599
 ] 

Julie Tibshirani commented on LUCENE-9705:
------------------------------------------

{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be a misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and would 
be much cleaner if we used these fresh {{Lucene90*Format}} classes.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If 
a user wants to read indices from a previous major version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to version N-2.
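
To make the lookup concrete, here is a minimal, self-contained sketch of name-based codec resolution. It is not Lucene's implementation: {{NamedCodec}}, {{index}} and {{forName}} are hypothetical stand-ins, and the two codec classes are stubs that only carry a name. In a real setup the providers would come from {{java.util.ServiceLoader}} (driven by {{META-INF/services}} entries); here they are passed in as a list so the example runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CodecLookupSketch {
    /** Hypothetical stand-in for a codec that is registered under a name. */
    interface NamedCodec {
        String name();
    }

    /** Stub for the current codec, which would ship in core. */
    static final class Lucene90Codec implements NamedCodec {
        public String name() { return "Lucene90"; }
    }

    /** Stub for an older codec, which would ship in backwards-codecs. */
    static final class Lucene87Codec implements NamedCodec {
        public String name() { return "Lucene87"; }
    }

    /**
     * Builds a name -> codec map from whatever providers are visible.
     * With service discovery, the argument would be
     * ServiceLoader.load(NamedCodec.class), which is also an Iterable.
     */
    static Map<String, NamedCodec> index(Iterable<? extends NamedCodec> providers) {
        Map<String, NamedCodec> byName = new LinkedHashMap<>();
        for (NamedCodec codec : providers) {
            byName.putIfAbsent(codec.name(), codec);
        }
        return byName;
    }

    /** Resolves the codec name recorded in an index, failing if it is not available. */
    static NamedCodec forName(Map<String, NamedCodec> byName, String name) {
        NamedCodec codec = byName.get(name);
        if (codec == null) {
            throw new IllegalArgumentException(
                "No codec named '" + name + "' is available; found: " + byName.keySet());
        }
        return codec;
    }

    public static void main(String[] args) {
        // Only core on the classpath: the older codec cannot be resolved.
        Map<String, NamedCodec> coreOnly = index(List.of(new Lucene90Codec()));
        // Core plus backwards-codecs: both names resolve.
        Map<String, NamedCodec> withBackwards =
            index(List.of(new Lucene90Codec(), new Lucene87Codec()));

        System.out.println(forName(withBackwards, "Lucene87").name());
        try {
            forName(coreOnly, "Lucene87");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that no version switch is needed at the call site: adding the jar that contains the older class is what makes the older name resolvable.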

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in the codec jars from Lucene 
7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/else version checks when making changes.
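
As a rough illustration of the trade-off, the sketch below contrasts the two styles on a toy format that encodes a single int. All class names here ({{VersionCheckedIntFormat}}, {{Legacy80IntFormat}}, {{Sketch90IntFormat}}) are hypothetical and do not exist in Lucene; the big-endian to little-endian switch is borrowed from LUCENE-9047 only as an example of a format change.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class CopyOnWriteFormatSketch {
    /** Style 1: one class, with version checks mixed into the logic. */
    static final class VersionCheckedIntFormat {
        static byte[] encode(int value, int majorVersion) {
            // Every future change has to be threaded through branches like this one.
            ByteOrder order = majorVersion >= 9 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
            return ByteBuffer.allocate(4).order(order).putInt(value).array();
        }
    }

    /** Style 2: "copy-on-write", one frozen class per format version. */
    interface IntFormat {
        byte[] encode(int value);
        int decode(byte[] bytes);
    }

    /** Old format, copied as-is and never touched again (would live in backwards-codecs). */
    static final class Legacy80IntFormat implements IntFormat {
        public byte[] encode(int value) {
            return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
        }
        public int decode(byte[] bytes) {
            return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
        }
    }

    /** New format; enhancements land here without any way to affect the old class. */
    static final class Sketch90IntFormat implements IntFormat {
        public byte[] encode(int value) {
            return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
        }
        public int decode(byte[] bytes) {
            return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
        }
    }

    public static void main(String[] args) {
        IntFormat oldFormat = new Legacy80IntFormat();
        IntFormat newFormat = new Sketch90IntFormat();
        // Each format round-trips its own bytes.
        System.out.println(oldFormat.decode(oldFormat.encode(42)));
        System.out.println(newFormat.decode(newFormat.encode(42)));
    }
}
```

Both styles produce the same bytes for a given version. The difference is where the risk sits: in the second style a change to {{Sketch90IntFormat}} cannot alter how old data is read, at the cost of some duplicated code.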

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> -----------------------------------------------------------
>
>                 Key: LUCENE-9705
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9705
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the 
> Lucene version in which they were created. With the upcoming release of 
> Lucene 9.0, it would be nice to move all those formats to just the 
> o.a.l.codecs.Lucene90 package (and of course move the current ones to 
> backwards-codecs).
> This issue would actually facilitate moving the directory API to little 
> endian (LUCENE-9047), as the only codecs that would need to handle backwards 
> compatibility would be the ones in backwards-codecs.
> In addition, it can help formalise the use of internal versions vs. format 
> versioning (LUCENE-9616).
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
