I forgot: this concerns the Solr 1.3.0 release.

On Wed, Sep 17, 2008 at 4:15 PM, dojolava <[EMAIL PROTECTED]> wrote:
> Hi,
>
> if I want to highlight a multivalued field I get the following exception:
>
> String index out of range: 21
> java.lang.StringIndexOutOfBoundsException: String index out of range: 21
>   at java.lang.String.substring(Unknown Source)
>   at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:240)
>   at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:372)
>
> I think this is because Solr gets the TokenStream by:
>
> // attempt term vectors
> tstream = TokenSources.getTokenStream(searcher.getReader(), docId,
>     fieldName);
>
> so the tstream contains the tokens of all values, but the actual value is
> in docTexts[j], and both are passed to the highlighter:
>
> TextFragment[] bestTextFragments =
>     highlighter.getBestTextFragments(tstream, docTexts[j],
>         mergeContiguousFragments, numFragments);
>
> Thus the highlighter tries to substring a token that does not exist in the
> text.
>
> It works if I use
>
> // fall back to analyzer
> tstream = new TokenOrderingFilter(
>     schema.getAnalyzer().tokenStream(fieldName,
>         new StringReader(docTexts[j])), 10);
>
> Regards,
> Mathis
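
To illustrate the suspected mismatch without pulling in Lucene, here is a minimal sketch in plain Java. It is not the Solr or Lucene code: OffsetToken, tokenize, tokenizeAllValues and offsetsFit are hypothetical names made up for this example, the tokenizer is a toy whitespace splitter, and the gap of one character between values is an assumption. It only shows that offsets computed over all values of a field can point past the end of a single value, which is the condition under which String.substring throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token with character offsets (not a Lucene class).
final class OffsetToken {
    final int start;
    final int end;

    OffsetToken(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

class OffsetMismatchDemo {

    // Toy whitespace tokenizer; offsets are shifted by 'base'.
    static List<OffsetToken> tokenize(String text, int base) {
        List<OffsetToken> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(new OffsetToken(base + start, base + i));
            }
        }
        return tokens;
    }

    // Simulates a stream covering every value of a multivalued field:
    // offsets keep growing from one value to the next.
    // Assumption: a gap of one character between consecutive values.
    static List<OffsetToken> tokenizeAllValues(String[] values) {
        List<OffsetToken> tokens = new ArrayList<>();
        int base = 0;
        for (String value : values) {
            tokens.addAll(tokenize(value, base));
            base += value.length() + 1;
        }
        return tokens;
    }

    // True if every token can safely be cut out of 'text' with substring.
    static boolean offsetsFit(List<OffsetToken> tokens, String text) {
        for (OffsetToken t : tokens) {
            if (t.end > text.length()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] values = {"first value", "second one"};
        String single = values[1];

        // Tokens built from all values, text from one value only.
        List<OffsetToken> all = tokenizeAllValues(values);
        System.out.println("all-values stream fits: " + offsetsFit(all, single));

        // Tokens built from the same text that is being highlighted.
        List<OffsetToken> own = tokenize(single, 0);
        System.out.println("per-value stream fits: " + offsetsFit(own, single));

        // Cutting a token of the second value with field-wide offsets fails.
        OffsetToken last = all.get(all.size() - 1);
        try {
            single.substring(last.start, last.end);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring failed: " + e.getMessage());
        }
    }
}
```

If that is what happens, re-analyzing docTexts[j] on its own (the analyzer fallback quoted above) keeps the offsets relative to the text that is actually highlighted, which would explain why that variant works.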