mayya-sharipova opened a new issue, #13103: URL: https://github.com/apache/lucene/issues/13103
### Description

`UnifiedHighlighter` with highlighting based on matches enabled incorrectly fails with `field 'X' was indexed without offsets, cannot highlight`, even though the field stores term vectors with positions and offsets.

Test to reproduce:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.classic.ClassicAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queries.intervals.IntervalQuery;
import org.apache.lucene.queries.intervals.Intervals;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.uhighlight.UnifiedHighlighter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.NIOFSDirectory;

// Stored text field with term vectors, positions, and offsets.
static final FieldType textType = new FieldType(TextField.TYPE_STORED);

static {
  textType.setStoreTermVectors(true);
  textType.setStoreTermVectorPositions(true);
  textType.setStoreTermVectorOffsets(true);
  textType.freeze();
}

public void testHighlight() {
  String indexPath = "../lucene-test-indices/index1";
  Path path = Paths.get(indexPath);
  try (Directory directory = NIOFSDirectory.open(path)) {
    // ClassicAnalyzer removes stop words such as "the".
    Analyzer analyzer = new ClassicAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    try (IndexWriter writer = new IndexWriter(directory, config)) {
      addDoc(writer, "The quick brown fox jumps over the lazy dog");
    }
    try (IndexReader reader = DirectoryReader.open(directory)) {
      IndexSearcher searcher = new IndexSearcher(reader);
      // The query text contains a stop word ("the") between "over" and "lazy".
      Query query =
          new IntervalQuery(
              "content",
              Intervals.analyzedText(
                  "quick brown fox jumps over the lazy dog", analyzer, "content", 0, true));
      TopDocs topDocs = searcher.search(query, 10);
      UnifiedHighlighter.Builder uhBuilder =
          new UnifiedHighlighter.Builder(searcher, analyzer).withWeightMatches(true);
      UnifiedHighlighter highlighter = new UnifiedHighlighter(uhBuilder);
      // Throws IllegalArgumentException instead of returning highlights.
      String[] highlights = highlighter.highlight("content", query, topDocs, 1);
      System.out.println(Arrays.toString(highlights));
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}

private static void addDoc(IndexWriter writer, String content) throws IOException {
  Document doc = new Document();
  doc.add(new Field("content", content, textType));
  writer.addDocument(doc);
}
```

This produces an error:

```
java.lang.IllegalArgumentException: field 'content' was indexed without offsets, cannot highlight
	at org.apache.lucene.search.uhighlight.FieldHighlighter.highlightOffsetsEnums(FieldHighlighter.java:157)
	at org.apache.lucene.search.uhighlight.FieldHighlighter.highlightFieldForDoc(FieldHighlighter.java:83)
	at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFieldsAsObjects(UnifiedHighlighter.java:944)
	at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFields(UnifiedHighlighter.java:814)
	at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFields(UnifiedHighlighter.java:792)
	at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlight(UnifiedHighlighter.java:725)
```

A workaround is to disable highlighting based on matches:

```java
UnifiedHighlighter.Builder uhBuilder =
    new UnifiedHighlighter.Builder(searcher, analyzer).withWeightMatches(false);
```

This happens because `ClassicAnalyzer` removes stop words, which leads to the use of `ExtendedIntervalsSource`, and that source returns -1 offsets.

### Version and environment details

Lucene 9.9.1
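
To make the stop-word gap mentioned in the description concrete, here is a minimal, dependency-free sketch. It does not use Lucene: `StopWordGapDemo`, its one-word `STOP_WORDS` set, and `positionIncrements` are hypothetical names that only mimic how a stop filter leaves a position gap behind a removed word. The real analysis chain and the interval source built from it are more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordGapDemo {
  // Assumption: a tiny stand-in stop list; a real analyzer's list is larger.
  static final Set<String> STOP_WORDS = Set.of("the");

  /**
   * Returns "term+increment" for every kept token. An increment greater
   * than 1 marks a position gap left by one or more removed stop words.
   */
  static List<String> positionIncrements(String text) {
    List<String> out = new ArrayList<>();
    int pending = 1; // increment to assign to the next kept token
    for (String raw : text.trim().split("\\s+")) {
      if (raw.isEmpty()) {
        continue; // empty input yields a single empty string from split
      }
      String term = raw.toLowerCase(Locale.ROOT);
      if (STOP_WORDS.contains(term)) {
        pending++; // dropped token still consumes a position
        continue;
      }
      out.add(term + "+" + pending);
      pending = 1;
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(positionIncrements("quick brown fox jumps over the lazy dog"));
  }
}
```

For the query text from the reproduction, the only increment greater than 1 falls on `lazy`, directly after the removed `the`. That gap is the condition the description associates with `ExtendedIntervalsSource` and its -1 offsets.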