[ https://issues.apache.org/jira/browse/LUCENE-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287996#comment-17287996 ]
ASF subversion and git services commented on LUCENE-9795:
---------------------------------------------------------

Commit 107926e486f8cd6bbfc8abb055c9f58fe56f9cbb in lucene-solr's branch refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=107926e ]

LUCENE-9795: fix CheckIndex not to validate SortedDocValues as if they were BinaryDocValues

CheckIndex already validates SortedDocValues properly: reads every document's ordinal and validates derefing all the ordinals back to bytes from the terms dictionary.

It should not do an additional (very slow) pass where it treats the field as if it were binary (doc -> ord -> byte[]), this is slow and doesn't validate any additional index data.

Now that the term dictionary of SortedDocValues may be compressed, it is especially slow to misuse the docvalues field in this way.

> investigate large checkindex/grouping regression in nightly benchmarks
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-9795
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9795
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9795.patch, Screen_Shot_2021-02-21_at_09.17.53.png, Screen_Shot_2021-02-21_at_09.30.30.png
>
> In the nightly benchmark, checkindex times increased more than 4x on the 2/16 datapoint
> Looking at the commits on 2/15, most obvious thing to look into is docvalues terms dict compression: LUCENE-9663
> Will try to pinpoint it more, my concern is some perf bug such as every single term causing decompression of the whole block repeatedly (missing seek-within-block opto?)
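
The difference between the two passes described in the commit message can be sketched with a toy model. This is not Lucene's API or the actual CheckIndex code. The class names, the `BLOCK_SIZE` constant, and the single-block cache that counts simulated decompressions are all assumptions made for illustration only. The sketch shows why validating ordinals and then dereferencing each ordinal once, in order, touches each dictionary block once, while a doc -> ord -> byte[] pass may revisit blocks for every document.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy model of a sorted doc-values field. Hypothetical, not Lucene's API. */
final class SortedDocValuesCheckSketch {
    /** Assumed number of terms per compressed block (illustrative only). */
    static final int BLOCK_SIZE = 4;

    /** Hypothetical terms dictionary that counts simulated block decompressions. */
    static final class TermsDict {
        final byte[][] terms;      // sorted, unique values
        long decompressions = 0;   // how many times a block had to be "decompressed"
        int cachedBlock = -1;      // assumption: only the last block stays decoded

        TermsDict(String... sortedTerms) {
            terms = new byte[sortedTerms.length][];
            for (int i = 0; i < sortedTerms.length; i++) {
                terms[i] = sortedTerms[i].getBytes(StandardCharsets.UTF_8);
            }
        }

        int size() {
            return terms.length;
        }

        byte[] lookupOrd(int ord) {
            if (ord < 0 || ord >= terms.length) {
                throw new IllegalStateException("ord out of range: " + ord);
            }
            int block = ord / BLOCK_SIZE;
            if (block != cachedBlock) { // switching blocks costs one decompression
                decompressions++;
                cachedBlock = block;
            }
            return terms[ord];
        }
    }

    /** Sorted-style check: validate every doc's ord, then deref each ord once, in order. */
    static long checkSorted(int[] docToOrd, TermsDict dict) {
        long before = dict.decompressions;
        for (int ord : docToOrd) {
            if (ord < 0 || ord >= dict.size()) {
                throw new IllegalStateException("doc has invalid ord: " + ord);
            }
        }
        byte[] prev = null;
        for (int ord = 0; ord < dict.size(); ord++) {
            byte[] term = dict.lookupOrd(ord);
            if (prev != null && Arrays.compareUnsigned(prev, term) >= 0) {
                throw new IllegalStateException("terms out of order at ord " + ord);
            }
            prev = term;
        }
        return dict.decompressions - before;
    }

    /** Binary-style pass: doc -> ord -> byte[] for every document. */
    static long checkAsBinary(int[] docToOrd, TermsDict dict) {
        long before = dict.decompressions;
        for (int ord : docToOrd) {
            dict.lookupOrd(ord); // re-derefs terms the sorted check already covered
        }
        return dict.decompressions - before;
    }

    public static void main(String[] args) {
        int[] docToOrd = {0, 4, 1, 5, 2, 6, 3, 7}; // docs alternate between two blocks
        TermsDict a = new TermsDict("a", "b", "c", "d", "e", "f", "g", "h");
        TermsDict b = new TermsDict("a", "b", "c", "d", "e", "f", "g", "h");
        System.out.println("sorted-style pass: " + checkSorted(docToOrd, a));
        System.out.println("binary-style pass: " + checkAsBinary(docToOrd, b));
    }
}
```

In this toy setup the sorted-style pass triggers 2 simulated decompressions and the binary-style pass triggers 8, one per document, because consecutive documents point into different blocks. How the real compressed terms dictionary caches blocks may differ; the sketch only illustrates the access pattern the commit message describes.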