I _thought_ you'd been around long enough to know about the options I
mentioned ;).

Right. I'd guess you're in UpdateHandler.addDoc and there's really no
batching at that level that I know of. I'm pretty sure that even
indexing batches of 1,000 documents from, say, SolrJ go through this
method.

I don't think there's much to be gained by any batching at this level;
it pretty much immediately tells Lucene to index the doc.
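
To make the per-document flow concrete, here is a minimal plain-Java
sketch. It is not Solr code: DocProcessor, BufferingProcessor and the
batch size are hypothetical names made up for illustration. It only
shows the general shape: when documents arrive one call at a time, a
stage that wants batches has to buffer them itself and release the
remainder when the request finishes. Whether this maps cleanly onto
Solr's update processor chain is an assumption, not something verified
here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a per-document processing stage; NOT a Solr class.
interface DocProcessor {
    void processAdd(String doc); // called once per incoming document
    void finish();               // called once when the request ends
}

// Collects documents and hands them on in batches of at most batchSize.
final class BufferingProcessor implements DocProcessor {
    private final int batchSize;
    private final Consumer<List<String>> batchHandler;
    private final List<String> buffer = new ArrayList<>();

    BufferingProcessor(int batchSize, Consumer<List<String>> batchHandler) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.batchHandler = batchHandler;
    }

    @Override
    public void processAdd(String doc) {
        buffer.add(doc);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    @Override
    public void finish() {
        flush(); // release whatever is left over at the end of the request
    }

    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Copy the buffer so the handler owns its batch after we clear ours.
        batchHandler.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With a batch size of 2 and three incoming documents, the handler would
see one full batch as soon as the second document arrives and the
leftover document only when finish() is called.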

FWIW
Erick

On Thu, Nov 3, 2016 at 11:10 AM, Markus Jelsma
<markus.jel...@openindex.io> wrote:
> Erick - in this case data can come from anywhere. There is one piece of code
> that all incoming documents, regardless of their origin, pass through: the
> update handler and update processors of Solr.
>
> In my case that is the most convenient point to partially modify the 
> documents, instead of moving that logic to separate places.
>
> I've seen the ContentStream in SolrQueryRequest and I could probably tear
> incoming data apart and put it back together again, but that would not be as
> easy as working with already deserialized objects such as SolrInputDocument.
>
> UpdateHandler doesn't seem to work on a list of documents; it looked like it
> works on each incoming document, not a whole list. I've also looked at
> whether I could buffer a batch in UpdateProcessor, work on them, and release
> them, but that seems impossible.
>
> Thanks,
> Markus
>
> -----Original message-----
>> From:Erick Erickson <erickerick...@gmail.com>
>> Sent: Thursday 3rd November 2016 18:57
>> To: solr-user <solr-user@lucene.apache.org>
>> Subject: Re: UpdateProcessor as a batch
>>
>> Markus:
>>
>> How are you indexing? SolrJ has a client.add(List<SolrInputDocument>)
>> form, and post.jar lets you add as many documents as you want in a
>> batch....
>>
>> Best,
>> Erick
>>
>> On Thu, Nov 3, 2016 at 10:18 AM, Markus Jelsma
>> <markus.jel...@openindex.io> wrote:
>> > Hi - I need to process a batch of documents on update but I cannot seem to
>> > find a point where I can hook in and process a list of SolrInputDocuments,
>> > neither in UpdateProcessor nor in UpdateHandler.
>> >
>> > For now I let it go and implemented it on a per-document basis. It is
>> > fast, but I'd prefer batches. Is that possible at all?
>> >
>> > Thanks,
>> > Markus
>>
