subject:"Search across a specified number of boundaries"

Re: Search across a specified number of boundaries

2013-01-15 Thread Mike Ree

Mikhail, Yeah, I considered that originally, but after analyzing the data I noticed that it was not possible. Some of the content we analyze contains large tables that, after OCR, get turned into long run-on sentences containing 500k+ words per sentence. Overall there are probably around 10k o

Re: Search across a specified number of boundaries

2013-01-14 Thread Mikhail Khludnev

Mike, When Lucene's Analyzer indexes the text, it adds positions to the index, which are later used by SpanQueries. Have you considered the idea of a position increment gap? e.g. the first sentence is indexed with word positions: 0,1,2,3,... the second sentence with 100,101,102,103,..., third 200,201,
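
To make the numbering scheme above concrete, here is a rough Python simulation of it. This is not Lucene code and does not use any Lucene API; the function names `index_positions` and `within_boundaries`, the whitespace tokenizer, and the gap of 100 are all assumptions made for illustration. It starts sentence i at position i * gap, and it refuses any sentence longer than the gap, which is the case the 500k-word sentences mentioned in the reply would hit.

```python
from collections import defaultdict


def index_positions(sentences, gap=100):
    """Map each token to its positions, starting sentence i at i * gap."""
    postings = defaultdict(list)
    for i, sentence in enumerate(sentences):
        tokens = sentence.lower().split()  # naive whitespace tokenizer (assumption)
        if len(tokens) > gap:
            # Positions would spill into the next sentence's range.
            raise ValueError(
                f"sentence {i} has {len(tokens)} tokens; gap {gap} is too small"
            )
        for offset, token in enumerate(tokens):
            postings[token].append(i * gap + offset)
    return postings


def within_boundaries(postings, a, b, max_boundaries, gap=100):
    """True if some occurrence of a and b are at most max_boundaries sentences apart."""
    return any(
        abs(pa // gap - pb // gap) <= max_boundaries  # pos // gap recovers the sentence index
        for pa in postings.get(a, [])
        for pb in postings.get(b, [])
    )


sentences = ["the cat sat", "a dog barked", "rain fell all night"]
postings = index_positions(sentences)
print(postings["dog"])                                 # [101]
print(within_boundaries(postings, "cat", "dog", 1))    # True
print(within_boundaries(postings, "cat", "rain", 1))   # False
```

The sketch recovers the sentence index by integer division, whereas a span query compares raw position distance against a slop, so treat it only as an illustration of how the positions are laid out.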