Re: Tracking down the input that hits an analysis chain bug

2014-01-16 Thread Benson Margulies
I think that https://issues.apache.org/jira/browse/SOLR-5623 should be ready to go. Would someone please commit from the PR? If there's a preference, I can attach a patch as well.
On Fri, Jan 10, 2014 at 1:37 PM, Benson Margulies wrote:
> Thanks, that's the recipe that I need.
>
> On Fri, Jan 10,

Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Benson Margulies
Thanks, that's the recipe that I need.
On Fri, Jan 10, 2014 at 11:40 AM, Chris Hostetter wrote:
> : Is there a neighborhood of existing tests I should be visiting here?
>
> You'll need a custom schema that refers to your new
> MockFailOnCertainTokensFilterFactory, so I would create a completely

Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Chris Hostetter
: Is there a neighborhood of existing tests I should be visiting here?
You'll need a custom schema that refers to your new MockFailOnCertainTokensFilterFactory, so I would create a completely new test class somewhere in ...solr.update (you're testing that an update fails with a clean error)

Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Benson Margulies
Is there a neighborhood of existing tests I should be visiting here?
On Fri, Jan 10, 2014 at 11:27 AM, Benson Margulies wrote:
> OK, patch forthcoming.
>
> On Fri, Jan 10, 2014 at 11:23 AM, Chris Hostetter wrote:
>> : The problem manifests as this sort of thing:
>> :
>> : Jan 3, 2014 6:05:

Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Benson Margulies
OK, patch forthcoming.
On Fri, Jan 10, 2014 at 11:23 AM, Chris Hostetter wrote:
> : The problem manifests as this sort of thing:
> :
> : Jan 3, 2014 6:05:33 PM org.apache.solr.common.SolrException log
> : SEVERE: java.lang.IllegalArgumentException: startOffset must be non-negative, and endO

Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Chris Hostetter
: The problem manifests as this sort of thing:
:
: Jan 3, 2014 6:05:33 PM org.apache.solr.common.SolrException log
: SEVERE: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=-1811581632,endOffset=-1811581632
Is there a st

Re: Tracking down the input that hits an analysis chain bug

2014-01-05 Thread Michael Sokolov
I think you do (or can) get a log message for each document insert? If that's all you need, I think logging configuration will get you there. I use log4j and turn Solr's pretty verbose logging off using:
log4j.logger.org.apache.lucene.solr = WARN
assuming the rest of log4j is set up OK, I th

Re: Tracking down the input that hits an analysis chain bug

2014-01-04 Thread Benson Margulies
I rather assumed that there was some log4j-ish config to be set that would do this for me. Lacking one, I guess I'll end up there.
On Fri, Jan 3, 2014 at 8:23 PM, Michael Sokolov wrote:
> Have you considered using a custom UpdateProcessor to catch the exception
> and provide more context in the l

Re: Tracking down the input that hits an analysis chain bug

2014-01-03 Thread Michael Sokolov
Have you considered using a custom UpdateProcessor to catch the exception and provide more context in the logs?
-Mike
On 01/03/2014 03:33 PM, Benson Margulies wrote:
Robert, Yes, if the problem was not data-dependent, indeed I wouldn't need to index anything. However, I've run a small mountai
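
For readers of the archive, the catch-and-add-context idea suggested in this message might look roughly like the sketch below. It is plain Java with made-up names (DocProcessor, ContextAddingProcessor, ContextDemo) standing in for one step of an update chain. It does not use Solr's actual UpdateProcessor classes, so treat it as an illustration of the pattern only, not as the code that ended up in any patch.

```java
import java.util.Map;

// Hypothetical stand-in for one step in an update chain; this is NOT Solr's API.
interface DocProcessor {
    void process(Map<String, String> doc);
}

// Wraps the next step and rethrows any failure with the document id attached,
// so the log shows which input triggered the analysis error.
final class ContextAddingProcessor implements DocProcessor {
    private final DocProcessor next;

    ContextAddingProcessor(DocProcessor next) {
        this.next = next;
    }

    @Override
    public void process(Map<String, String> doc) {
        try {
            next.process(doc);
        } catch (RuntimeException e) {
            // Keep the original exception as the cause so its stack trace survives.
            throw new IllegalStateException(
                "Failed while processing document id=" + doc.get("id")
                    + ": " + e.getMessage(), e);
        }
    }
}

public class ContextDemo {
    // Runs one document through a deliberately failing step (an assumption made
    // for the demo) and returns the message of the wrapped exception.
    static String describeFailure(String id) {
        DocProcessor failing = doc -> {
            throw new IllegalArgumentException("startOffset must be non-negative");
        };
        try {
            new ContextAddingProcessor(failing)
                .process(Map.of("id", id, "text", "some input"));
            return "no failure";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure("doc-42"));
    }
}
```

The wrapper only adds the document id to the message and leaves the original exception attached as the cause, which is usually enough to find the offending input in a large batch.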

Re: Tracking down the input that hits an analysis chain bug

2014-01-03 Thread Benson Margulies
Robert, Yes, if the problem was not data-dependent, indeed I wouldn't need to index anything. However, I've run a small mountain of data through our tokenizer on my machine, and never seen the error, but my customer gets these errors in the middle of a giant spew of data. As it happens, I _was_ mi

Re: Tracking down the input that hits an analysis chain bug

2014-01-03 Thread Robert Muir
This exception comes from OffsetAttributeImpl (e.g. you don't need to index anything to reproduce it). Maybe you have a missing clearAttributes() call (your tokenizer 'returns true' without calling that first)? This could explain it, if something like a StopFilter is also present in the chain: basi