[ https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383683#comment-17383683 ]
Mayya Sharipova commented on LUCENE-9450:
-----------------------------------------

I am not familiar with how the taxonomy works, so it would be very helpful if you added migration instructions.

Answering your question:

> If the user has 4 old StoredFields based segments and 1 new BinaryDocValues
> based segment (like the way you just described), does force merging them to 1
> produce a single BinaryDocValues based segment?

There is a `FieldInfo` instance for each distinctly named field, and it records how that field is indexed (for example, with BinaryDocValues). Starting with 9.0, there is a check that each field is indexed the same way in every segment of the index: you can't have a field indexed with BinaryDocValues in one segment and without BinaryDocValues in another. In 8.x this mix is still allowed, and when you merge the segments into a single segment, the resulting segment will record the field as indexed with BinaryDocValues.

The 9.0 check applies to each individual field, so if you transition 4 fields from stored fields to binary doc values, a separate transition needs to be done for each field.

> reindexing documents as compared to force merging segments to one, so I guess
> the recommended approach for them would be to just reindex.

Reindexing should work as well, if it is not an expensive operation for users.

> Taxonomy index should use DocValues not StoredFields
> ----------------------------------------------------
>
>                 Key: LUCENE-9450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.5.2
>            Reporter: Gautam Worah
>            Priority: Minor
>              Labels: performance
>             Fix For: main (9.0)
>
>         Attachments: LUCENE-9450-localrun.py-v1, wip_taxonomy_patch
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]
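
To make the per-field check from the comment above concrete, here is a toy model in plain Java. It does not use Lucene's API: the `FieldSchemaSketch` class, the `DocValues` enum, and the field name `label` are all hypothetical, and each segment is reduced to a map from field name to how the field is indexed. It only mirrors the two behaviours described in the comment (8.x tolerates mixed segments and the merged segment records BinaryDocValues; 9.0 rejects the mix), so treat it as a sketch rather than how Lucene implements this.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldSchemaSketch {
    // Toy stand-in for "how a field is indexed"; NOT Lucene's own type.
    enum DocValues { NONE, BINARY }

    // 9.0-style behaviour as described in the comment: a field must be
    // indexed the same way in every segment, otherwise the index is rejected.
    static Map<String, DocValues> checkStrict(List<Map<String, DocValues>> segments) {
        Map<String, DocValues> schema = new HashMap<>();
        for (Map<String, DocValues> segment : segments) {
            for (Map.Entry<String, DocValues> field : segment.entrySet()) {
                DocValues previous = schema.putIfAbsent(field.getKey(), field.getValue());
                if (previous != null && previous != field.getValue()) {
                    throw new IllegalArgumentException(
                        "field '" + field.getKey() + "' is indexed inconsistently across segments");
                }
            }
        }
        return schema;
    }

    // 8.x-style behaviour as described in the comment: mixed segments are
    // tolerated, and the merged segment records the field as BINARY if any
    // source segment did.
    static Map<String, DocValues> mergeLenient(List<Map<String, DocValues>> segments) {
        Map<String, DocValues> merged = new HashMap<>();
        for (Map<String, DocValues> segment : segments) {
            for (Map.Entry<String, DocValues> field : segment.entrySet()) {
                merged.merge(field.getKey(), field.getValue(),
                    (a, b) -> (a == DocValues.BINARY || b == DocValues.BINARY)
                        ? DocValues.BINARY : DocValues.NONE);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical field name; one old stored-fields segment, one new one.
        Map<String, DocValues> oldSegment = Map.of("label", DocValues.NONE);
        Map<String, DocValues> newSegment = Map.of("label", DocValues.BINARY);
        List<Map<String, DocValues>> segments = List.of(oldSegment, newSegment);

        System.out.println("lenient merge: " + mergeLenient(segments));
        try {
            checkStrict(segments);
        } catch (IllegalArgumentException e) {
            System.out.println("strict check: " + e.getMessage());
        }
    }
}
```

Because the check is keyed by field name, each field that moves from stored fields to binary doc values has to pass it on its own, which is why the transition is per field.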