Thanks, all, for your valuable ideas on this matter. I will try them. :)

Regards,
Dileepa


On Sun, Dec 1, 2013 at 6:05 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> There is no support for throttling built into DIH. You can probably write a
> Transformer which sleeps a while after every N requests to simulate
> throttling.
> On 26 Nov 2013 14:21, "Dileepa Jayakody" <dileepajayak...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I have a requirement to import a large amount of data from a MySQL
> > database and index the documents (about 1000 documents).
> > During the indexing process I need to do special processing of a field by
> > sending enhancement requests to an external Apache Stanbol server.
> > I have configured my dataimport handler in solrconfig.xml to use the
> > StanbolContentProcessor in the update chain, as below:
> >
> >   <updateRequestProcessorChain name="stanbolInterceptor">
> >     <processor class="com.solr.stanbol.processor.StanbolContentProcessorFactory"/>
> >     <processor class="solr.RunUpdateProcessorFactory" />
> >   </updateRequestProcessorChain>
> >
> >   <requestHandler name="/dataimport" class="solr.DataImportHandler">
> >     <lst name="defaults">
> >       <str name="config">data-config.xml</str>
> >       <str name="update.chain">stanbolInterceptor</str>
> >     </lst>
> >   </requestHandler>
> >
> > My sample data-config.xml is as below:
> >
> > <dataConfig>
> >   <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
> >               url="jdbc:mysql://localhost:3306/solrTest" user="test" password="test123"
> >               batchSize="1" />
> >   <document name="stanboldata">
> >     <entity name="stanbolrequest" query="SELECT * FROM documents">
> >       <field column="id" name="id" />
> >       <field column="content" name="content" />
> >       <field column="title" name="title" />
> >     </entity>
> >   </document>
> > </dataConfig>
> >
> > When running a large import with about 1000 documents, my Stanbol server
> > goes down, I suspect due to heavy load from the above Solr
> > StanbolInterceptor.
> > I would like to throttle the dataimport in batches, so that Stanbol can
> > process a manageable number of requests concurrently.
> > Is this achievable using the batchSize parameter in the dataSource element
> > in the data-config?
> > Can someone please give some ideas on how to throttle the dataimport load
> > in Solr?
> >
> > Thanks,
> > Dileepa
> >
>
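
The suggestion quoted above (a Transformer that sleeps for a while after every N requests) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested DIH component: the class name RowThrottle, the batch size of 50 and the pause of 2000 ms are all hypothetical, and the Solr wiring is shown only as a comment so that the helper itself compiles without Solr on the classpath. As far as I know, batchSize on JdbcDataSource is passed to the JDBC driver as a fetch-size hint, so it would not by itself pace the requests sent to Stanbol.

```java
/**
 * Hypothetical helper that pauses the calling thread after every
 * rowsPerBatch rows. The name and the numbers used below are assumptions
 * made for illustration; they do not come from Solr or Stanbol.
 *
 * Assumed wiring into a custom DIH transformer (not compiled here):
 *
 *   public class ThrottlingTransformer
 *       extends org.apache.solr.handler.dataimport.Transformer {
 *     private static final RowThrottle THROTTLE = new RowThrottle(50, 2000L);
 *
 *     public Object transformRow(Map<String, Object> row, Context context) {
 *       THROTTLE.onRow();   // sleeps after every 50th row
 *       return row;         // the row itself is passed through unchanged
 *     }
 *   }
 */
final class RowThrottle {
    private final int rowsPerBatch;
    private final long pauseMillis;
    private long rowsSeen = 0;

    RowThrottle(int rowsPerBatch, long pauseMillis) {
        if (rowsPerBatch <= 0) {
            throw new IllegalArgumentException("rowsPerBatch must be positive");
        }
        if (pauseMillis < 0) {
            throw new IllegalArgumentException("pauseMillis must not be negative");
        }
        this.rowsPerBatch = rowsPerBatch;
        this.pauseMillis = pauseMillis;
    }

    /**
     * Records one row. Sleeps for pauseMillis when the running row count
     * reaches a multiple of rowsPerBatch.
     *
     * @return true if this call paused, false otherwise
     */
    synchronized boolean onRow() {
        rowsSeen++;
        if (rowsSeen % rowsPerBatch != 0) {
            return false;
        }
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller can still notice it.
            Thread.currentThread().interrupt();
        }
        return true;
    }

    /** Number of rows recorded so far. */
    synchronized long getRowsSeen() {
        return rowsSeen;
    }
}
```

A transformer like the one in the comment would be registered through the transformer attribute of the entity element in data-config.xml, using its fully qualified class name. Because it only slows down how fast rows reach the update chain, the right batch size and pause length would have to be found by experiment against the Stanbol server.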
