[ https://issues.apache.org/jira/browse/LUCENE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041689#comment-17041689 ]
Adrien Grand commented on LUCENE-9236:
--------------------------------------

Agreed with Robert regarding the abstractions. There is a part of your change that I liked though, where you were creating BinaryEntry/NumericEntry/... on the Consumer side as well, which made the Consumer and Producer look more symmetric.

> Having a modular Doc Values format
> ----------------------------------
>
>                 Key: LUCENE-9236
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9236
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: juan camilo rodriguez duran
>            Priority: Minor
>              Labels: docValues
>
> Today DocValues Consumer/Producer require overriding 5 different methods, even
> if you only want to use one, and even though a given field can only support
> one doc values type at a time.
>
> In the attached PR I've implemented a new modular version of those classes
> (consumer/producer), each one having a single responsibility and writing to
> the same unique file.
> This is mainly a refactor of the existing format, opening the possibility to
> override or implement only the sub-format you need.
>
> I'll do this in 3 steps:
> # Create a CompositeDocValuesFormat and move the code of
> Lucene80DocValuesFormat into separate classes, without modifying the inner
> code. At the same time I created a Lucene85CompositeDocValuesFormat based on
> these changes.
> # I'll introduce some basic components for writing doc values in general,
> such as:
> ## DocumentIdSetIterator Serializer: used in each type of field, based on an
> IndexedDISI.
> ## Document Ordinals Serializer: used in Sorted and SortedSet to
> deduplicate values using a dictionary.
> ## Document Boundaries Serializer (optional, used only for multivalued
> fields: SortedNumeric and SortedSet).
> ## TermsEnum Serializer: useful to write and read the terms dictionary for
> sorted and sorted set doc values.
> # I'll create the new Sub-DocValues format using the previous components.
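
For readers skimming the thread, the modular layout described in the quoted issue can be illustrated with a rough, self-contained sketch. It does not use Lucene's classes and is not the code from the PR: `CompositeDocValuesSketch`, `SubConsumer`, `CompositeConsumer`, and the local `DocValuesType` enum are hypothetical names invented for illustration, and a plain list stands in for the single shared data file. The idea shown is only the dispatch: one small writer per doc values type, with a composite routing each field to the writer registered for its type.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class CompositeDocValuesSketch {

    /** Stand-in for the five doc values types; a field has exactly one of them. */
    enum DocValuesType { NUMERIC, BINARY, SORTED, SORTED_NUMERIC, SORTED_SET }

    /** Hypothetical single-responsibility writer: handles one doc values type only. */
    interface SubConsumer {
        void addField(String fieldName, List<String> out);
    }

    /** Hypothetical composite: routes each field to the writer registered for its type. */
    static final class CompositeConsumer {
        private final Map<DocValuesType, SubConsumer> writers =
            new EnumMap<>(DocValuesType.class);
        // Stand-in for the single data file that all sub-writers share.
        private final List<String> out = new ArrayList<>();

        CompositeConsumer register(DocValuesType type, SubConsumer writer) {
            writers.put(type, writer);
            return this;
        }

        void addField(String fieldName, DocValuesType type) {
            SubConsumer writer = writers.get(type);
            if (writer == null) {
                throw new UnsupportedOperationException(
                    "no sub-format registered for " + type);
            }
            writer.addField(fieldName, out);
        }

        List<String> written() {
            return out;
        }
    }

    /** Registers only the two sub-writers this example needs and writes two fields. */
    static List<String> demo() {
        CompositeConsumer consumer = new CompositeConsumer()
            .register(DocValuesType.NUMERIC, (field, out) -> out.add("numeric:" + field))
            .register(DocValuesType.SORTED, (field, out) -> out.add("sorted:" + field));
        consumer.addField("price", DocValuesType.NUMERIC);
        consumer.addField("category", DocValuesType.SORTED);
        return consumer.written();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch an implementer supplies only the sub-writers they care about, instead of overriding all five methods. Whether such a split is worth the extra abstraction is the point under discussion in the comment above.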