[ https://issues.apache.org/jira/browse/LUCENE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051043#comment-17051043 ]
Bruno Roustant edited comment on LUCENE-9257 at 3/4/20, 9:36 AM: ----------------------------------------------------------------- Removing FSTLoadMode enum is tempting, but there are several cases handled with FSTLoadMode.AUTO: - force on-heap for ID field. - force on-heap for SegmentReadState opened from a writer. - off-heap only for IndexInput instanceof ByteBufferIndexInput. So your benchmarks showed very small overhead for indexing. What about ID field? Also are there important use-cases with IndexInput not instanceof ByteBufferIndexInput? I'm not comfortable with always loading the FST off-heap. But maybe remove the FSTLoadMode enum and always let the logic behind FSTLoadMode.AUTO? Or replace FSTLoadMode by a simple boolean fstOnHeap (otherwise it is auto determined with the above logic)? was (Author: broustant): Removing FSTLoadMode enum is tempting, but there are several cases handled with FSTLoadMode.AUTO: - force on-heap for ID field. - force on-heap for SegmentReadState opened from a writer. - off-heap only for IndexInput instanceof ByteBufferIndexInput. So your benchmarks showed very small overhead for indexing. What about ID field? Also are there important use-cases with IndexInput not instanceof ByteBufferIndexInput? I'm not comfortable with always loading the FST off-heap. But maybe remove the FSTLoadMode enum and always let the logic behind FSTLoadMode.AUTO? Or replace FSTLoadMode by a simple boolean forceOnHeap (otherwise it is auto determined with the above logic)? > FSTLoadMode should not be BlockTree specific as it is used more generally in > index package > ------------------------------------------------------------------------------------------ > > Key: LUCENE-9257 > URL: https://issues.apache.org/jira/browse/LUCENE-9257 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Bruno Roustant > Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > FSTLoadMode and its associate attribute key (static String) are currently > defined in BlockTreeTermsReader, but they are actually used outside of > BlockTree in the general "index" package. > CheckIndex and ReadersAndUpdates are using these enum and attribute key to > drive the FST load mode through the SegmentReader which is not specific to a > postings format. They have an unnecessary dependency to BlockTreeTermsReader. > We could move FSTLoadMode out of BlockTreeTermsReader, to make it a public > enum of the "index" package. That way CheckIndex and ReadersAndUpdates do not > import anymore BlockTreeTermsReader. > This would also allow other postings formats to use the same enum (e.g. > LUCENE-9254) -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org