On Fri, Mar 2, 2012 at 7:37 AM, andrew <and...@digicol.de> wrote:
> I was able to create a test case.
>
> We are querying ranges of documents. When I tried to isolate the document
> that causes trouble, I found it happens with exactly every second request
> only for a single document query (it fails constantly when requesting a
> range of documents where that document is included). I could also reproduce
> the exception with only that single document in the index.
>
> I think it is not a good idea to post the Solr <add/> XML here - it is very
> long (text extract of a newspaper page) and may not reproduce verbatim
> (whitespace etc.) if I paste it here.
>
> iorixxx, koji - is it ok if I send the necessary artifacts (add XML, schema,
> config) via email?
>
You can also open a JIRA issue (https://issues.apache.org/jira/browse/SOLR)
and upload everything as attachments.

I would also be very interested if you could test a nightly 3.6 build
(https://builds.apache.org/job/Solr-3.x/lastSuccessfulBuild/artifact/artifacts/).

There have been *numerous* offset bugs fixed in 3.6 in a variety of
tokenizers/tokenfilters besides the HTMLStripCharFilter:

https://issues.apache.org/jira/browse/LUCENE-3642
https://issues.apache.org/jira/browse/SOLR-2891
https://issues.apache.org/jira/browse/LUCENE-3717

--
lucidimagination.com