Re: Solr exceptions during batch indexing

2014-11-08 Thread Erick Erickson
bq: Just trying to understand what's the challenge in returning the bad doc
Mostly, nobody has done it yet. There's some complication about async updates, ConcurrentUpdateSolrServer for instance. I suspect also that one has to write error handling logic in the client anyway so the motivation is re

Re: Solr exceptions during batch indexing

2014-11-08 Thread Anurag Sharma
Just trying to understand what's the challenge in returning the bad doc id(s)? Solr already knows which doc(s) failed on update and can return their id(s) in the response or a callback. Can we have a JIRA ticket on it if it doesn't exist? This looks like a common use case and every Solr consumer might be w

Re: Solr exceptions during batch indexing

2014-11-07 Thread Walter Underwood
Right, that is why we batch. When a batch of 1000 fails, drop to a batch size of 1 and start the batch over. Then it can report the exact document with problems. If you want to continue, go back to the bigger batch size. I usually fail the whole batch on one error. wunder Walter Underwood wun.
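The fallback described above (send a large batch, and on failure resend that batch one document at a time to find the bad ones) might look roughly like the sketch below. It is not taken from anyone's code in this thread. `send_batch` is a hypothetical callable standing in for whatever client call posts documents to Solr and raises on error; `index_with_fallback` and the `batch_size` default are likewise illustrative names. The sketch assumes that resending a document that was already accepted is harmless, and it keeps going after a bad document rather than failing the whole batch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Doc = Dict[str, object]


def index_with_fallback(
    docs: Sequence[Doc],
    send_batch: Callable[[List[Doc]], None],  # hypothetical: raises on any error
    batch_size: int = 1000,
) -> List[Tuple[Doc, Exception]]:
    """Index docs in batches; on a batch failure, retry that batch one doc at a time.

    Returns the (doc, exception) pairs that still failed when sent alone.
    """
    failures: List[Tuple[Doc, Exception]] = []
    for start in range(0, len(docs), batch_size):
        batch = list(docs[start:start + batch_size])
        try:
            send_batch(batch)
        except Exception:
            # The batch failed somewhere; drop to a batch size of 1 and
            # start this batch over so the exact bad document is identified.
            for doc in batch:
                try:
                    send_batch([doc])
                except Exception as exc:
                    failures.append((doc, exc))
        # The next iteration goes back to the bigger batch size.
    return failures
```

The caller can log or queue the returned failures. To fail the whole batch on the first error instead, re-raise inside the inner `except` rather than appending.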

Re: Solr exceptions during batch indexing

2014-11-07 Thread Peter Keegan
I'm seeing 9X throughput with 1000 docs/batch vs 1 doc/batch, with a single thread, so it's certainly worth it. Thanks, Peter On Fri, Nov 7, 2014 at 2:18 PM, Erick Erickson wrote: > And Walter has also been around for a _long_ time ;) > > (sorry, couldn't resist) > > Erick > > On Fri, Nov

Re: Solr exceptions during batch indexing

2014-11-07 Thread Erick Erickson
And Walter has also been around for a _long_ time ;) (sorry, couldn't resist) Erick On Fri, Nov 7, 2014 at 11:12 AM, Walter Underwood wrote: > Yes, I implemented exactly that fallback for Solr 1.2 at Netflix. > > It isn’t too hard if the code is structured for it; retry with a batch size of

Re: Solr exceptions during batch indexing

2014-11-07 Thread Walter Underwood
Yes, I implemented exactly that fallback for Solr 1.2 at Netflix. It isn’t too hard if the code is structured for it; retry with a batch size of 1. wunder On Nov 7, 2014, at 11:01 AM, Erick Erickson wrote: > Yeah, this has been an ongoing issue for a _long_ time. Basically, > you can't. So far,

Re: Solr exceptions during batch indexing

2014-11-07 Thread Erick Erickson
Yeah, this has been an ongoing issue for a _long_ time. Basically, you can't. So far, people have essentially written fallback logic to index the docs of a failing packet one at a time and report it. I'd really like better reporting back, but we haven't gotten there yet. Best, Erick On Fri, Nov

Solr exceptions during batch indexing

2014-11-07 Thread Peter Keegan
How are folks handling Solr exceptions that occur during batch indexing? Solr stops parsing the docs stream when an error occurs (e.g. a doc with a missing mandatory field), and stops indexing the batch. The bad document is not identified, so it would be hard for the client to recover by skipping o