[ https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17179244#comment-17179244 ]
Michael McCandless commented on LUCENE-9450:
--------------------------------------------

Whoa, that is good news! It looks like the change is net/net a speedup for pure faceting tasks! Those {{Browse*}} tasks are essentially a {{MatchAllDocsQuery}} counting facets for all hits.

However, those QPS numbers are silly high and not really trustworthy. Which {{-source}} did you use? Can you run with {{-source wikimediumall}}? That will index all ~33.3M documents.

Also, since you had to benchmark two indices (this change impacts the index), can you add {{forceMerge = True}} to your two indices? Otherwise each index might have a different segment geometry, making the query performance not very comparable. With {{forceMerge = True}} they will both merge down to one segment, making the QPS numbers at least comparable, if perhaps not realistic for a production setting.

> Taxonomy index should use DocValues not StoredFields
> ----------------------------------------------------
>
>                 Key: LUCENE-9450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.5.2
>            Reporter: Gautam Worah
>            Priority: Minor
>              Labels: performance
>         Attachments: wip_taxonomy_patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]