One DataImportHandler <requestHandler> entry is a single-threaded job.
Always.

You can define two <requestHandler> entries that use the same config file and
use request parameters to choose which entity each one runs. The two handlers
will not share any cached data (for example, the cache kept by
CachedSqlEntityProcessor).
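
As a rough sketch of that setup: the fragment below registers two handlers against one config file and pins each to one entity. The handler names and the config file name follow the quoted thread. Putting "entity" in the handler defaults is an assumption on my part (it relies on defaults being merged into the request parameters that DIH reads), so check it on your Solr version. Passing &entity=... on the URL, as in the quoted messages, is the variant reported to work.

```xml
<!-- Sketch only: the per-handler "entity" default is an untested assumption. -->
<requestHandler name="/dataimportUsers"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
    <!-- Assumed to behave like entity=users on the request URL. -->
    <str name="entity">users</str>
  </lst>
</requestHandler>

<requestHandler name="/dataimportProducts"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
    <!-- Assumed to behave like entity=products on the request URL. -->
    <str name="entity">products</str>
  </lst>
</requestHandler>
```

Each handler is still a single-threaded job, so the parallelism comes only from calling /dataimportUsers?command=full-import and /dataimportProducts?command=full-import independently.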

On Fri, May 18, 2012 at 8:14 AM, Sergio Martín Cantero <
sergio.mar...@playence.com> wrote:

>  Yes, James, I have tried it myself.
>
> http://localhost:8080/solr/dataimportUsers?command=full-import&entity=users&entity=products
>
> That request runs sequentially: first the users entity and, once it has
> finished, the products entity.
>
> The problem with that approach is that you cannot launch the load
> of a third entity while those are loading.
>
> Let's say you have 2 entities:
> - users (takes 30 minutes to finish)
> - products (takes 1 hour to finish)
>
> You might want to launch users indexing every hour and products indexing
> every 15 minutes. With command=full-import&entity=users&entity=products
> that's not possible, while having two duplicate data import handlers
> makes it possible.
>
> Again, thanks a lot for your help.
>
>   [image: www.playence.com] <http://www.playence.com>
>      Sergio Martín Cantero
>  Office (ES) +34 91 733 73 97
>   playence Spain SL sergio.mar...@playence.com
> Calle Vicente Gaceo 19
>
>   28029 Madrid - España
>
>
> On 18/05/12 17:04, Dyer, James wrote:
>
>  I don't see a drawback with the 2-handler approach you show.  I think
> this is the best way to have parallel processing with DIH currently.
>
> My understanding of the wiki (
> http://wiki.apache.org/solr/DataImportHandler#Commands) is that it seems
> to say that it will run "users" and "products" at the same time with a
> single handler if you do something like this:
>
> http://localhost:8080/solr/dataimportUsers?command=full-import&entity=users&entity=products
>
> But to me this feature seems obscure at best and, like the "threads"
> feature, I would imagine it would be buggy.  I was wondering if anyone
> out there is using it with success, or if perhaps the wiki is in error on
> this point.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
> From: Sergio Martín Cantero [mailto:sergio.mar...@playence.com]
> Sent: Friday, May 18, 2012 9:33 AM
> To: Dyer, James; solr-user@lucene.apache.org
> Subject: Re: Use DIH with more than one entity at the same time
>
> I see.
>
> What I need is not multiple threads for one entity but multiple entities
> at the same time.
>
> What I have done is register a separately named DIH for each entity in
> solrconfig.xml, although they all use the same data-import-config.xml.
> Something like:
>   <!-- Used for simultaneous full-import with various entities -->
>   <requestHandler name="/dataimportUsers"
>                   class="org.apache.solr.handler.dataimport.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">data-import-config.xml</str>
>     </lst>
>   </requestHandler>
>   <!-- Used for simultaneous full-import with various entities -->
>   <requestHandler name="/dataimportProducts"
>                   class="org.apache.solr.handler.dataimport.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">data-import-config.xml</str>
>     </lst>
>   </requestHandler>
>
> Then I can run each entity at the same time with:
> http://localhost:8080/solr/dataimportUsers?command=full-import&entity=users
>
> http://localhost:8080/solr/dataimportProducts?command=full-import&entity=products
>
> Here, users and products are both entities defined in the same
> data-import-config.xml.
>
> This way, I don't need to wait for users to finish before running products.
> This allows me to call full-import for users, let's say, every 15 min and
> for products every 10 min, without waiting until one has finished. The two
> imports can overlap.
>
> Any drawback to this approach?
>
> Thanks!!
>
> Sergio
>
> On 18/05/12 16:21, Dyer, James wrote:
>
> "threads" lets you run a single entity with multiple threads, so it's
> probably not what you wanted.  What we've done here is partition the
> source data and then we have multiple handlers running at the same time,
> each processing its own partition.  So we multi-thread the import without
> using the "threads" parameter.
>
> Even if "threads" sounds like something useful, I recommend against using
> it.  It has tons of bugs, although some fixes were made for Solr 3.6.  For
> Solr 4.0 this feature is removed.
>
> James Dyer
>
> From: Sergio Martín Cantero [mailto:sergio.mar...@playence.com]
> Sent: Friday, May 18, 2012 6:23 AM
> To: solr-user@lucene.apache.org
> Cc: Dyer, James
> Subject: Re: Use DIH with more than one entity at the same time
>
> What the wiki indicates actually works, although it's not what I wanted. I
> have tried it and it works fine.
>
> I have also tried Jack's approach and it also works fine (and is what I was
> looking for :-)
>
> Still, I have one more question. You wrote: "This is a 1.4.1
> installation, back when there was no "threads" option in DIH." I'm using
> Solr 3.5. What would using threads change? How could I take advantage
> of it, instead of declaring several DIHs in solrconfig.xml?
>
> Thanks a lot!
>
>
> On 17/05/12 18:33, Dyer, James wrote:
>
> The wiki here indicates that you can specify "entity" more than once on the
> request and it will run multiple entities at the same time, in the same
> handler:  http://wiki.apache.org/solr/DataImportHandler#Commands
>
> But I can't say for sure that this actually works!  Having been in the DIH
> code, I would think such a feature is buggy at best, if it works at all.  But
> if you try it, let us know how it works for you.  Also, if anyone else out
> there is using multiple "entity" parameters to get entities running in
> parallel, I'd be interested in hearing about it.
>
> But the approach taken in the link Jack cites below does work.  It's a pain to
> set up, though.
>
> James Dyer
>
> From: Jack Krupansky [mailto:j...@basetechnology.com]
> Sent: Thursday, May 17, 2012 10:21 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Use DIH with more than one entity at the same time
>
> Okay, the answer is “Yes, sort of, but...”
>
> “One annoyance is because of how DIH is designed, you need a separate handler
> set up in solrconfig.xml for each DIH you plan to run.  So you have to plan
> in advance how many DIH instances you want to run, which config files they'll
> use, etc.”
>
> See:
> http://lucene.472066.n3.nabble.com/Multiple-dataimport-processes-to-same-core-td3645525.html
>
> -- Jack Krupansky
>
> From: Sergio Martín Cantero <sergio.mar...@playence.com>
> Sent: Thursday, May 17, 2012 11:07 AM
> To: solr-user@lucene.apache.org
> Cc: Jack Krupansky <j...@basetechnology.com>
> Subject: Re: Use DIH with more than one entity at the same time
>
> Thanks Jack, but that's not what I want.
>
> I don't want multiple entities in one invocation, but two simultaneous
> invocations of the DIH with different entities.
>
> Thanks.
>
> Sergio Martín Cantero
>
> On 17/05/12 17:04, Jack Krupansky wrote:
>
> Yes. From the doc:
>
> "Multiple 'entity' parameters can be passed on to run multiple entities at
> once. If nothing is passed, all entities are executed."
>
> See:
> http://wiki.apache.org/solr/DataImportHandler
>
> But that is one invocation of DIH, not two separate updates as you tried.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Sergio Martín Cantero
> Sent: Thursday, May 17, 2012 10:46 AM
> To: solr-user@lucene.apache.org
> Subject: Use DIH with more than one entity at the same time
>
> I'm new to this list, so... Hello everybody.
>
> I'm trying to run the DIH with more than one entity at the same time,
> but only the first entity I call is being indexed. The other call doesn't
> get any response.
>
> For example:
>
> First call:
> http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
>
> Before the indexing has finished, I call:
> http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products
>
> The second call doesn't have any effect, and the products are not
> indexed at all.
>
> Isn't it possible to run more than one full import for different
> entities at the same time?
>
> Thanks a lot for your help
> Sergio
>
>


-- 
Lance Norskog
goks...@gmail.com
