Hello Michael, are you not on Lucene 4.8? See https://issues.apache.org/jira/plugins/servlet/mobile#issue/LUCENE-5111
Michael Sokolov <msoko...@safaribooksonline.com> wrote:

For posterity, in case anybody follows this thread, I tracked the problem
down to WordDelimiterFilter; apparently it creates an offset of -1 in
some cases, which PostingsHighlighter rejects.

-Mike

On 5/2/2014 10:20 AM, Michael Sokolov wrote:
> I checked using the analysis admin page, and I believe there are
> offsets being generated (I assume start/end=offsets). So IDK; I am
> going to try reindexing again. Maybe I neglected to reload the config
> before I indexed last time.
>
> -Mike
>
> On 05/02/2014 09:34 AM, Michael Sokolov wrote:
>> I've been wanting to try out the PostingsHighlighter, so I added
>> storeOffsetsWithPositions to my field definition, enabled the
>> highlighter in solrconfig.xml, reindexed and tried it out. When I
>> issue a query I'm getting this error:
>>
>> field 'text' was indexed without offsets, cannot highlight
>>
>> java.lang.IllegalArgumentException: field 'text' was indexed without offsets, cannot highlight
>>     at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightDoc(PostingsHighlighter.java:545)
>>     at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightField(PostingsHighlighter.java:467)
>>     at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:392)
>>     at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:293)
>>
>> I've been trying to figure out why the field wouldn't have offsets
>> indexed, but I just can't see it. Is there something in the analysis
>> chain that could be stripping out offsets?
>>
>> This is the field definition:
>>
>> <field name="text" type="text_en" indexed="true" stored="true"
>>        multiValued="false" termVectors="true" termPositions="true"
>>        termOffsets="true" storeOffsetsWithPositions="true" />
>>
>> (Yes I know PH doesn't require term vectors; I'm keeping them around
>> for now while I experiment)
>>
>> <fieldType name="text_en" class="solr.TextField"
>>            positionIncrementGap="100">
>>   <analyzer type="index">
>>     <!-- We are indexing mostly HTML so we need to ignore the
>>          tags -->
>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>     <!--<tokenizer class="solr.StandardTokenizerFactory"/>-->
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <!-- lower casing must happen before WordDelimiterFilter or
>>          protwords.txt will not work -->
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             stemEnglishPossessive="1" protected="protwords.txt"/>
>>     <!-- This deals with contractions -->
>>     <filter class="solr.SynonymFilterFactory"
>>             synonyms="synonyms.txt" expand="true" ignoreCase="true"/>
>>     <filter class="solr.HunspellStemFilterFactory"
>>             dictionary="en_US.dic" affix="en_US.aff" ignoreCase="true"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <!--<tokenizer class="solr.StandardTokenizerFactory"/>-->
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <!-- lower casing must happen before WordDelimiterFilter or
>>          protwords.txt will not work -->
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             protected="protwords.txt"/>
>>     <!-- setting tokenSeparator="" solves issues with compound
>>          words and improves phrase search -->
>>     <filter class="solr.HunspellStemFilterFactory"
>>             dictionary="en_US.dic" affix="en_US.aff" ignoreCase="true"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>
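For anyone who wants to hunt for the offending token in their own analysis chain, here is a minimal sketch of the kind of offset sanity check involved. It is plain Java with no Lucene dependency: the `OffsetCheck` class, its `Token` type, and `firstInvalid` are hypothetical names made up for illustration, and the rules it applies (start offset not negative, end not before start, start offsets not going backwards) are my assumption about what counts as a valid offset, not a copy of the checks in Lucene or PostingsHighlighter. The sample tokens are invented, not taken from the index discussed above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene or Solr.
final class OffsetCheck {

    /** One token with its character offsets, e.g. the start/end values on the analysis admin page. */
    static final class Token {
        final String term;
        final int start;
        final int end;

        Token(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns the index of the first token whose offsets look invalid, or -1 if all pass.
     * Assumed rules: start >= 0, end >= start, and start never moves backwards.
     */
    static int firstInvalid(List<Token> tokens) {
        int lastStart = 0;
        for (int i = 0; i < tokens.size(); i++) {
            Token t = tokens.get(i);
            if (t.start < 0 || t.end < t.start || t.start < lastStart) {
                return i;
            }
            lastStart = t.start;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Invented sample: the second token carries a start offset of -1.
        List<Token> tokens = Arrays.asList(
                new Token("wi", 0, 2),
                new Token("fi", -1, 5),
                new Token("router", 6, 12));
        int bad = firstInvalid(tokens);
        System.out.println(bad < 0
                ? "all offsets look valid"
                : "suspicious offsets at token index " + bad + " (" + tokens.get(bad).term + ")");
    }
}
```

To use it against a real field, you would copy the term, start and end values for a problem document from the analysis admin page (after the WordDelimiterFilter stage) into the token list and see which entry gets flagged.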