For #2 you might be able to get away with the following: https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component
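A request to that component could be put together roughly as in the sketch below. This is only an illustration under assumptions that are not from this thread: a handler with the Term Vector Component registered at /tvrh (as in the sample configs), a hypothetical collection "mycollection", and a hypothetical field "content" indexed with termVectors, termPositions and termOffsets enabled. It only builds the URL and sends nothing.

```python
from urllib.parse import urlencode


def build_term_vector_url(base_url, collection, query, field):
    """Build a Term Vector Component request URL (no network call is made)."""
    params = {
        "q": query,
        "fl": "id",              # return only doc IDs; the text is assumed to live elsewhere
        "tv": "true",            # switch the Term Vector Component on
        "tv.fl": field,          # field to return term vectors for (hypothetical name)
        "tv.positions": "true",  # ask for term positions
        "tv.offsets": "true",    # ask for start/end character offsets
        "wt": "json",
    }
    # "/tvrh" is an assumed handler path; adjust to whatever solrconfig.xml defines.
    return f"{base_url}/{collection}/tvrh?{urlencode(params)}"


url = build_term_vector_url(
    "http://localhost:8983/solr", "mycollection", "content:solr", "content"
)
print(url)
```

Whether the offsets come back for a field that is indexed but not stored is exactly the open question here, so it would need to be checked against a real index.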
The Term Vector component can return offsets and positions. Not sure how
useful they would be to you, but it is at least a starting point. I'm
assuming this requires only termVectors and termPositions and won't
require stored to be true.

Kevin Risden

On Tue, Nov 29, 2016 at 12:00 PM, Kevin Risden <compuwizard...@gmail.com> wrote:

> For #3 specifically, I've always found this page useful:
>
> https://cwiki.apache.org/confluence/display/solr/Field+Properties+by+Use+Case
>
> It lists out what properties are necessary on each field based on a use
> case.
>
> Kevin Risden
>
> On Tue, Nov 29, 2016 at 11:49 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> (1) Not that I have readily at hand. And to make it
>> worse, there's the UnifiedHighlighter coming out soon....
>>
>> I don't think there's a good way for (2).
>>
>> For (3) at least, yes. The reason is simple. For analyzed text,
>> the only thing in the index is what's made it through the
>> analysis chains. So stopwords are missing. Stemming
>> has been done. You could even have put a phonetic filter
>> in there and have terms like ARDT KNTR which would
>> be...er...not very useful to show the end user, so the original
>> text must be available.
>>
>> Not much help...
>> Erick
>>
>> On Tue, Nov 29, 2016 at 8:43 AM, John Bickerstaff
>> <j...@johnbickerstaff.com> wrote:
>> > All,
>> >
>> > One of the questions I've been asked to answer / prove out is around
>> > highlighting query matches in responses.
>> >
>> > BTW - One assumption I'm making is that highlighting is basically a
>> > function of storing offsets for terms / tokens at index time. If that's
>> > not right, I'd be grateful for pointers in the right direction.
>> >
>> > My underlying need is to get highlighting on search term matches for
>> > returned documents.
>> > I need to choose between doing this in Solr and using
>> > an external document store, so I'm interested in whether Solr can provide
>> > the doc store with the information necessary to identify which section(s)
>> > of the doc to highlight in a query response...
>> >
>> > A few questions:
>> >
>> > 1. This page doesn't say a lot about how things work - is there somewhere
>> > with more information on dealing with offsets and highlighting? On offsets
>> > and how they're handled?
>> > https://cwiki.apache.org/confluence/display/solr/Highlighting
>> >
>> > 2. Can I return offset information with a query response or is that
>> > internal only? If yes, can I return offset info if I have NOT stored the
>> > data in Solr but indexed only?
>> >
>> > (Explanation: Currently my project is considering indexing only and storing
>> > the entire text elsewhere -- using Solr to return only doc IDs for
>> > searches. If Solr could also return offsets, these could be used in
>> > processing the text stored elsewhere to provide highlighting.)
>> >
>> > 3. Do I assume correctly that in order for Solr highlighting to work
>> > correctly, the text MUST also be stored in Solr (i.e. not indexed only, but
>> > stored=true)?
>> >
>> > Many thanks...