Thanks, all, for your valuable ideas on this matter. I will try them. :)

Regards,
Dileepa

On Sun, Dec 1, 2013 at 6:05 PM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:

> There is no support for throttling built into DIH. You can probably write a
> Transformer which sleeps a while after every N requests to simulate
> throttling.
>
> On 26 Nov 2013 14:21, "Dileepa Jayakody" <dileepajayak...@gmail.com> wrote:
>
> > Hi All,
> >
> > I have a requirement to import a large amount of data from a MySQL database
> > and index the documents (about 1000 documents).
> > During indexing I need to do special processing of a field by sending
> > enhancement requests to an external Apache Stanbol server.
> > I have configured my DataImportHandler in solrconfig.xml to use the
> > StanbolContentProcessor in the update chain, as below:
> >
> > <updateRequestProcessorChain name="stanbolInterceptor">
> >   <processor class="com.solr.stanbol.processor.StanbolContentProcessorFactory"/>
> >   <processor class="solr.RunUpdateProcessorFactory" />
> > </updateRequestProcessorChain>
> >
> > <requestHandler name="/dataimport" class="solr.DataImportHandler">
> >   <lst name="defaults">
> >     <str name="config">data-config.xml</str>
> >     <str name="update.chain">stanbolInterceptor</str>
> >   </lst>
> > </requestHandler>
> >
> > My sample data-config.xml is as below:
> >
> > <dataConfig>
> >   <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
> >       url="jdbc:mysql://localhost:3306/solrTest" user="test" password="test123"
> >       batchSize="1" />
> >   <document name="stanboldata">
> >     <entity name="stanbolrequest" query="SELECT * FROM documents">
> >       <field column="id" name="id" />
> >       <field column="content" name="content" />
> >       <field column="title" name="title" />
> >     </entity>
> >   </document>
> > </dataConfig>
> >
> > When running a large import with about 1000 documents, my Stanbol server
> > goes down, I suspect due to heavy load from the stanbolInterceptor chain
> > above.
> > I would like to throttle the dataimport in batches, so that Stanbol can
> > process a manageable number of requests concurrently.
> > Is this achievable using the batchSize parameter of the dataSource element
> > in the data-config?
> > Can someone please give some ideas on how to throttle the dataimport load
> > in Solr?
> >
> > Thanks,
> > Dileepa
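
For reference, the Transformer idea suggested in the thread (sleep a while after every N rows) could be sketched roughly as below. This is only a minimal, self-contained illustration, not a tested DIH plugin: the class name ThrottlingTransformer, the default values (10 rows, 1000 ms) and the getPauseCount() helper are all hypothetical. As far as I know, DIH accepts either a subclass of org.apache.solr.handler.dataimport.Transformer or a plain class with a public transformRow(Map<String, Object> row) method and a no-arg constructor, referenced from the entity's transformer attribute; please verify that against your Solr version before relying on it.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a row transformer that pauses after every N rows.
 * Deliberately has no Solr dependency so it compiles on its own;
 * all names and default values here are assumptions for illustration.
 */
public class ThrottlingTransformer {
    private final int rowsPerPause;   // N: pause once per this many rows
    private final long pauseMillis;   // how long each pause lasts
    private long rowCount = 0;
    private long pauseCount = 0;

    /** No-arg constructor with hypothetical defaults (DIH would use this one). */
    public ThrottlingTransformer() {
        this(10, 1000L);
    }

    public ThrottlingTransformer(int rowsPerPause, long pauseMillis) {
        if (rowsPerPause <= 0 || pauseMillis < 0) {
            throw new IllegalArgumentException("rowsPerPause must be > 0 and pauseMillis >= 0");
        }
        this.rowsPerPause = rowsPerPause;
        this.pauseMillis = pauseMillis;
    }

    /** Returns the row unchanged; sleeps after every rowsPerPause-th row. */
    public Object transformRow(Map<String, Object> row) {
        rowCount++;
        if (rowCount % rowsPerPause == 0) {
            try {
                Thread.sleep(pauseMillis);
                pauseCount++;
            } catch (InterruptedException e) {
                // Restore the interrupt flag so the caller can stop the import.
                Thread.currentThread().interrupt();
            }
        }
        return row;
    }

    /** Number of completed pauses so far (helper for testing the sketch). */
    public long getPauseCount() {
        return pauseCount;
    }

    public static void main(String[] args) {
        ThrottlingTransformer throttle = new ThrottlingTransformer(2, 50L);
        Map<String, Object> row = new HashMap<>();
        row.put("id", 1);
        for (int i = 0; i < 5; i++) {
            throttle.transformRow(row);
        }
        System.out.println("pauses: " + throttle.getPauseCount());
    }
}
```

If this approach works with your setup, the entity in data-config.xml would name the class in its transformer attribute (using whatever package you put it in). Note that this only slows down how fast rows are handed to the update chain; it does not limit anything inside the Stanbol processor itself.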