[ 
https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383683#comment-17383683
 ] 

Mayya Sharipova commented on LUCENE-9450:
-----------------------------------------

I am not familiar with how the taxonomy works, so it would be very helpful 
if you added migration instructions.

Answering your question:

>  If the user has 4 old StoredFields based segments and 1 new BinaryDocValues 
> based segment (like the way you just described), does force merging them to 1 
> produce a single BinaryDocValues based segment?

We have a `FieldInfo` object for each distinctly named field, and it records 
how that field is indexed (for example, with BinaryDocValues). Starting with 
9.0, there is a check that each field is indexed the same way in every segment 
of the index: you can't have a field indexed with BinaryDocValues in one 
segment and without BinaryDocValues in another. In 8.x this mix is still 
allowed, and when you merge such segments into a single segment, the resulting 
segment will record the field as indexed with BinaryDocValues. 

This check in 9.0 is applied to each individual field, so if you transition 
4 fields from stored fields to binary doc values, a separate transition needs 
to be done for each field. 
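
To make the rule concrete, here is a toy model of it in plain Java. It is 
only a sketch and does not use Lucene's API: `FieldSchemaSketch`, `DvType`, 
and the field name `fieldA` are all made up for illustration, and a "segment" 
is reduced to a map from field name to how that field is indexed. It assumes 
the behaviour described above: the 9.0-style check flags a field whose type 
differs between segments, and the 8.x-style merge records the field as 
BinaryDocValues if any input segment had it that way.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldSchemaSketch {
    // Hypothetical stand-in for "how a field is indexed"; not Lucene's DocValuesType.
    enum DvType { NONE, BINARY }

    // 9.0-style check (toy version): return the fields whose type is not
    // the same in every segment that contains them.
    static List<String> inconsistentFields(List<Map<String, DvType>> segments) {
        Map<String, DvType> seen = new LinkedHashMap<>();
        List<String> conflicts = new ArrayList<>();
        for (Map<String, DvType> segment : segments) {
            for (Map.Entry<String, DvType> e : segment.entrySet()) {
                DvType previous = seen.putIfAbsent(e.getKey(), e.getValue());
                boolean differs = previous != null && previous != e.getValue();
                if (differs && !conflicts.contains(e.getKey())) {
                    conflicts.add(e.getKey());
                }
            }
        }
        return conflicts;
    }

    // 8.x-style merge (toy version): the merged segment records a field as
    // BINARY if any of the input segments had it as BINARY.
    static Map<String, DvType> mergeLenient(List<Map<String, DvType>> segments) {
        Map<String, DvType> merged = new LinkedHashMap<>();
        for (Map<String, DvType> segment : segments) {
            for (Map.Entry<String, DvType> e : segment.entrySet()) {
                merged.merge(e.getKey(), e.getValue(),
                        (a, b) -> (a == DvType.BINARY || b == DvType.BINARY)
                                ? DvType.BINARY : DvType.NONE);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // 4 old segments without doc values plus 1 new segment with them.
        List<Map<String, DvType>> segments = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            segments.add(Map.of("fieldA", DvType.NONE));
        }
        segments.add(Map.of("fieldA", DvType.BINARY));

        System.out.println(inconsistentFields(segments)); // prints [fieldA]
        System.out.println(mergeLenient(segments));       // prints {fieldA=BINARY}
    }
}
```

Because the check is keyed by field name, each field shows up (or not) in the 
conflict list independently, which is why every field needs its own 
transition.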

 

> reindexing documents as compared to force merging segments to one, so I guess 
> the recommended approach for them would be to just reindex.

 

Reindexing should work as well, if it is not an expensive operation for users. 

 

> Taxonomy index should use DocValues not StoredFields
> ----------------------------------------------------
>
>                 Key: LUCENE-9450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.5.2
>            Reporter: Gautam Worah
>            Priority: Minor
>              Labels: performance
>             Fix For: main (9.0)
>
>         Attachments: LUCENE-9450-localrun.py-v1, wip_taxonomy_patch
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before 
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
