[ https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276599#comment-17276599 ]
Julie Tibshirani commented on LUCENE-9705:
------------------------------------------

{quote}It's especially clear here where we must copy a lot of classes with no change at all, merely to clearly and consistently document the index version change.
{quote}
I’ll try to add some context since I suspect there might be a misunderstanding. In general, when there is a new major version, we *do not* plan to create all new index format classes. We only copy a class and move it to backwards-codecs when there is a change to that specific format, for example {{PointsFormat}}.

This proposal applies only to the 9.0 release, and its main purpose is to support the work in LUCENE-9047 to move all formats to little endian. My understanding is that moving to little endian impacts all the formats and would be much cleaner if we used these fresh {{Lucene90*Format}} classes.

{quote}I wonder if we (eventually) should consider shifting to a versioning system that doesn't require new classes. Is this somehow a feature of the service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If a user wants to read indices from a previous major version, they can depend on backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we allow reading indices back to version N-2.

I personally really like the current "copy-on-write" system for formats. There’s code duplication, but it has advantages over combining different version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like {{Lucene70Codec}} is almost as if we were pulling in the codec jars from Lucene 7.0.
 * It decreases the risk of introducing bugs or accidental changes. If you’re making an enhancement to a new format, there’s little chance of changing the logic for an old format (since it lives in a separate class). This is especially important since older formats are not tested as thoroughly.
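To make the trade-off above concrete, here is a minimal sketch of the two styles. It does not use Lucene's actual classes: {{ToyIntFormat}}, {{Toy80IntFormat}}, {{Toy90IntFormat}}, {{ToyFormats}} and {{VersionBranchingIntFormat}} are hypothetical names, and the name-keyed map is only a simplified stand-in for the service discovery lookup. The only difference between the two toy formats is byte order, loosely mirroring the little endian change.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a codec format; this is not a Lucene interface. */
interface ToyIntFormat {
  String name();
  byte[] encode(int value);
  int decode(byte[] bytes);
}

/** Old format, frozen as-is (the kind of class that would move to backwards-codecs). */
final class Toy80IntFormat implements ToyIntFormat {
  @Override public String name() { return "Toy80"; }
  @Override public byte[] encode(int value) {
    return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
  }
  @Override public int decode(byte[] bytes) {
    return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
  }
}

/** New format: a copy of the old class with only the byte order changed. */
final class Toy90IntFormat implements ToyIntFormat {
  @Override public String name() { return "Toy90"; }
  @Override public byte[] encode(int value) {
    return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
  }
  @Override public int decode(byte[] bytes) {
    return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
  }
}

/** Name-based lookup, a simplified stand-in for loading formats via service discovery. */
final class ToyFormats {
  private static final Map<String, ToyIntFormat> BY_NAME = new HashMap<>();
  static {
    for (ToyIntFormat f : new ToyIntFormat[] {new Toy80IntFormat(), new Toy90IntFormat()}) {
      BY_NAME.put(f.name(), f);
    }
  }
  private ToyFormats() {}

  static ToyIntFormat forName(String name) {
    ToyIntFormat format = BY_NAME.get(name);
    if (format == null) {
      throw new IllegalArgumentException("Unknown format: " + name);
    }
    return format;
  }
}

/** The alternative: one class that branches on the version wherever behavior differs. */
final class VersionBranchingIntFormat {
  private VersionBranchingIntFormat() {}

  static byte[] encode(int version, int value) {
    // Every future change has to be threaded through checks like this one.
    ByteOrder order = version >= 90 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
    return ByteBuffer.allocate(4).order(order).putInt(value).array();
  }
}

final class CopyOnWriteFormatsDemo {
  public static void main(String[] args) {
    System.out.println(Arrays.toString(ToyFormats.forName("Toy80").encode(1))); // [0, 0, 0, 1]
    System.out.println(Arrays.toString(ToyFormats.forName("Toy90").encode(1))); // [1, 0, 0, 0]
    System.out.println(Arrays.toString(VersionBranchingIntFormat.encode(80, 1))); // [0, 0, 0, 1]
  }
}
```

In the copied-class style, an enhancement to {{Toy90IntFormat}} cannot alter how {{Toy80IntFormat}} reads old data, whereas in the branching style both versions share one method body.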
I started to appreciate it after experiencing the alternative in Elasticsearch, where we’re constantly bumping into if/else version checks when making changes.

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> -----------------------------------------------------------
>
>                 Key: LUCENE-9705
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9705
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Ignacio Vera
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the
> Lucene version they were created. With the upcoming release of Lucene 9.0, it
> would be nice to move all those formats to just the o.a.l.codecs.Lucene90
> package (and of course moving the current ones to the backwards-codecs).
> This issue would actually facilitate moving the directory API to little
> endian (LUCENE-9047) as the only codecs that would need to handle backwards
> compatibility will be the codecs in backwards codecs.
> In addition, it can help formalising the use of internal versions vs format
> versioning (LUCENE-9616)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)