I rather assumed that there was some log4j-ish config to be set that would do this for me. Lacking one, I guess I'll end up there.
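
A rough sketch of the wrapping pattern Mike describes, in plain Java with no Solr dependency so it can be run on its own. Every type here (`AddStep`, `ContextAddingStep`, the string doc ids) is a hypothetical stand-in, not a Solr class. In a real chain I would expect the equivalent to be an `UpdateRequestProcessor` subclass that wraps `super.processAdd(cmd)` in the same try/catch and logs something like `cmd.getPrintableId()`, but treat that mapping as an assumption to check against the 4.3.1 javadocs.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Stdlib-only model of "wrap the next step in the chain and record which
 * document was in flight when it failed". Not Solr code.
 */
public class DocIdContextSketch {

    /** Hypothetical stand-in for the next processor in an update chain. */
    interface AddStep {
        void processAdd(String docId, String body);
    }

    /** Delegates to the next step and attaches the doc id to any IllegalArgumentException. */
    static final class ContextAddingStep implements AddStep {
        private final AddStep next;
        private final List<String> log;

        ContextAddingStep(AddStep next, List<String> log) {
            this.next = next;
            this.log = log;
        }

        @Override
        public void processAdd(String docId, String body) {
            try {
                next.processAdd(docId, body);
            } catch (IllegalArgumentException e) {
                // Record the id first, then rethrow with the original as the cause
                // so the existing stack trace is not lost.
                log.add("add failed for doc id=" + docId + ": " + e.getMessage());
                throw new IllegalArgumentException("doc id=" + docId + ": " + e.getMessage(), e);
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();

        // Stand-in for the indexing step; it rejects one made-up input.
        AddStep indexer = (docId, body) -> {
            if (body.contains("BAD")) {
                throw new IllegalArgumentException("startOffset must be non-negative");
            }
        };

        AddStep chain = new ContextAddingStep(indexer, log);
        chain.processAdd("doc-1", "fine");
        try {
            chain.processAdd("doc-2", "BAD input");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(log);
    }
}
```

The wrapper rethrows rather than swallowing the exception, so the failing add still surfaces to the client; the only change is that the document id lands in the log next to the offset error.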

On Fri, Jan 3, 2014 at 8:23 PM, Michael Sokolov <msoko...@safaribooksonline.com> wrote:
> Have you considered using a custom UpdateProcessor to catch the exception
> and provide more context in the logs?
>
> -Mike
>
>
> On 01/03/2014 03:33 PM, Benson Margulies wrote:
>>
>> Robert,
>>
>> Yes, if the problem was not data-dependent, indeed I wouldn't need to
>> index anything. However, I've run a small mountain of data through our
>> tokenizer on my machine, and never seen the error, but my customer
>> gets these errors in the middle of a giant spew of data. As it
>> happens, I _was_ missing that call to clearAttributes(), (and the
>> usual implementation of end()), but I found and fixed that problem
>> precisely by creating a random data test case using checkRandomData().
>> Unfortunately, fixing that didn't make the customer's errors go away.
>>
>> So I'm left needing to help them identify the data that provokes this,
>> because I've so far failed to come up with any.
>>
>> --benson
>>
>>
>> On Fri, Jan 3, 2014 at 2:16 PM, Robert Muir <rcm...@gmail.com> wrote:
>>>
>>> This exception comes from OffsetAttributeImpl (e.g. you dont need to
>>> index anything to reproduce it).
>>>
>>> Maybe you have a missing clearAttributes() call (your tokenizer
>>> 'returns true' without calling that first)? This could explain it, if
>>> something like a StopFilter is also present in the chain: basically
>>> the offsets overflow.
>>>
>>> the test stuff in BaseTokenStreamTestCase should be able to detect
>>> this as well...
>>>
>>> On Fri, Jan 3, 2014 at 1:56 PM, Benson Margulies <ben...@basistech.com>
>>> wrote:
>>>>
>>>> Using Solr Cloud with 4.3.1.
>>>>
>>>> We've got a problem with a tokenizer that manifests as calling
>>>> OffsetAtt.setOffsets() with invalid inputs. OK, so, we want to figure
>>>> out what input provokes our code into getting into this pickle.
>>>>
>>>> The problem happens on SolrCloud nodes.
>>>>
>>>> The problem manifests as this sort of thing:
>>>>
>>>> Jan 3, 2014 6:05:33 PM org.apache.solr.common.SolrException log
>>>> SEVERE: java.lang.IllegalArgumentException: startOffset must be
>>>> non-negative, and endOffset must be >= startOffset,
>>>> startOffset=-1811581632,endOffset=-1811581632
>>>>
>>>> How could we get a document ID so that we can tell which document was
>>>> being processed?