I'm looking to index data in Solr using a PHP page that feeds the index.
In my application all docs are already "converted" to Solr add XML
documents, and I need Solr to pull all changed documents into the index.
Looking at DIH, I decided to use URLDataSource with useSolrAddSchema=true,
pointing to my application URL: getchangeddocstoindex.php.

However, my PHP page could stream hundreds of megabytes (maybe a couple of
gigabytes!). Does anybody know whether I need to adjust connectionTimeout
and readTimeout in any way?
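
For reference, here is a minimal sketch of the data-config.xml I have in
mind. The host name, the entity name and the timeout values are only
placeholders I made up for illustration, not tested settings; as far as I
understand, both timeouts are given in milliseconds.

```xml
<dataConfig>
  <!-- Timeouts are in milliseconds; the values here are placeholders, not recommendations -->
  <dataSource type="URLDataSource"
              encoding="UTF-8"
              connectionTimeout="5000"
              readTimeout="600000"/>
  <document>
    <!-- "changedDocs" and the host name are hypothetical -->
    <entity name="changedDocs"
            processor="XPathEntityProcessor"
            url="http://myapp.example.com/getchangeddocstoindex.php"
            useSolrAddSchema="true"
            stream="true"/>
  </document>
</dataConfig>
```

I added stream="true" on the assumption that it keeps the processor from
loading the whole response into memory, but I have not verified how it
behaves with a response of this size.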

Looking at the URLDataSource documentation, it seems possible to implement
a kind of chunking using a Transformer together with $hasMore and $nextUrl.

But with useSolrAddSchema=true, I don't know how to set up a Transformer section.

My questions are:
1) Is there a response size above which chunking is advisable?
2) Is it possible to do chunking with useSolrAddSchema=true?

Thanks


Dario.
