I see.
What I need is not multiple threads for one entity but multiple entities
at the same time.
What I have done is register a separately named DIH handler for each of
the entities in solrconfig.xml, although they all use the same
data-import-config.xml.
Something like:
<!-- Used for simultaneous full-import with various entities -->
<requestHandler name="/dataimportUsers"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
  </lst>
</requestHandler>
<!-- Used for simultaneous full-import with various entities -->
<requestHandler name="/dataimportProducts"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
  </lst>
</requestHandler>
Then I can run each entity at the same time with:
http://localhost:8080/solr/dataimportUsers?command=full-import&entity=users
http://localhost:8080/solr/dataimportProducts?command=full-import&entity=products
Here users and products are entities defined in the same
data-import-config.xml.
This way, I don't need to wait for users to finish before running products.
This lets me call full-import for users, say, every 15 minutes and
for products every 10 minutes, without waiting for the other one to finish.
The two imports can overlap.
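For illustration, here is a minimal Python sketch of triggering both imports from a script. It assumes the two handlers above are registered and that Solr is reachable at the base URL used in the examples. The helper names (import_url, trigger) are hypothetical and not part of Solr. As far as I know, DIH starts a full-import in the background and answers the HTTP request right away, so plain sequential requests should be enough to get overlapping imports.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the examples above; adjust for your installation.
SOLR_BASE = "http://localhost:8080/solr"

# Maps each entity to the handler registered for it in solrconfig.xml.
HANDLERS = {"users": "dataimportUsers", "products": "dataimportProducts"}


def import_url(entity, base=SOLR_BASE):
    """Build the full-import URL for the handler dedicated to `entity`."""
    query = urlencode({"command": "full-import", "entity": entity})
    return f"{base}/{HANDLERS[entity]}?{query}"


def trigger(entity):
    """Send the full-import request and return the HTTP status code."""
    with urlopen(import_url(entity)) as response:
        return response.status


if __name__ == "__main__":
    # Each request goes to its own handler, so neither blocks the other.
    for entity in ("users", "products"):
        print(entity, trigger(entity))
```

A scheduler such as cron could call the script every 15 minutes for users and every 10 minutes for products.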
Any drawback to this approach?
Thanks!!
Sergio
On 18/05/12 16:21, Dyer, James wrote:
"threads" lets you run a single entity with multiple threads, so it's
probably not what you wanted. What we've done here is partition the
source data and then run multiple handlers at the same
time, each processing its own partition. So we multi-thread the import
without using the "threads" parameter.
Even if "threads" sounds like something useful, I recommend against using
it. It has tons of bugs, although some fixes were made for Solr
3.6. For Solr 4.0 this feature is removed.
*James Dyer*
E-Commerce Systems
Ingram Content Group
(615) 213-4311
*From:*Sergio Martín Cantero [mailto:sergio.mar...@playence.com]
*Sent:* Friday, May 18, 2012 6:23 AM
*To:* solr-user@lucene.apache.org
*Cc:* Dyer, James
*Subject:* Re: Use DIH with more than one entity at the same time
What the wiki indicates actually works, although it's not what I
wanted. I have tried it and it works fine.
I have also tried Jack's approach and it also works fine (and is what I
was looking for :-)
Still, I have one more question. You wrote: "This is a 1.4.1
installation, back when there was no "threads" option in DIH." I'm
using Solr 3.5. What would using threads change? How could I take
advantage of it, instead of declaring various DIHs in solrconfig.xml?
Thanks a lot!
On 17/05/12 18:33, Dyer, James wrote:
The wiki here indicates that you can specify "entity" more than once on the
request and it will run multiple entities at the same time, in the same handler:
http://wiki.apache.org/solr/DataImportHandler#Commands
But I can't say for sure that this actually works! Having been in the DIH code, I would think such a feature is buggy at best, if it works at all. But if you try it let us know how it works for you. Also, if anyone else out there is using multiple "entity" parameters to get entities running in parallel, I'd be interested in hearing about it.
But the approach taken in the link Jack cites below does work. It's a pain to set up, though.
James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 10:21 AM
To: solr-user@lucene.apache.org
Subject: Re: Use DIH with more than one entity at the same time
Okay, the answer is “Yes, sort of, but...”
“One annoyance is because of how DIH is designed, you need a separate handler set up in solrconfig.xml for each DIH you plan to run. So you have to plan in advance how many DIH instances you want to run, which config files they'll use, etc.”
See:
http://lucene.472066.n3.nabble.com/Multiple-dataimport-processes-to-same-core-td3645525.html
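Since every parallel import needs its own handler entry, the entries could be generated rather than typed by hand. The following is only a sketch: the function name handler_xml is hypothetical, and the handler class and config file name are the ones used in the solrconfig.xml example elsewhere in this thread.

```python
from xml.sax.saxutils import escape, quoteattr

# Handler class as used in the solrconfig.xml example in this thread.
HANDLER_CLASS = "org.apache.solr.handler.dataimport.DataImportHandler"


def handler_xml(name, config="data-import-config.xml"):
    """Return a <requestHandler> entry for a DIH handler named `/name`."""
    return (
        f'<requestHandler name={quoteattr("/" + name)} class="{HANDLER_CLASS}">\n'
        '  <lst name="defaults">\n'
        f'    <str name="config">{escape(config)}</str>\n'
        '  </lst>\n'
        '</requestHandler>'
    )


if __name__ == "__main__":
    # One entry per import you plan to run in parallel.
    for handler_name in ("dataimportUsers", "dataimportProducts"):
        print(handler_xml(handler_name))
```

The output still has to be pasted into solrconfig.xml, followed by a core reload or restart, so the number of handlers must still be decided in advance.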
-- Jack Krupansky
From: Sergio Martín Cantero <sergio.mar...@playence.com>
Sent: Thursday, May 17, 2012 11:07 AM
To: solr-user@lucene.apache.org
Cc: Jack Krupansky <j...@basetechnology.com>
Subject: Re: Use DIH with more than one entity at the same time
Thanks Jack, but that's not what I want.
I don't want multiple entities in one invocation, but two simultaneous invocations of the DIH with different entities.
Thanks.
Sergio Martín Cantero
Office (ES) +34 91 733 73 97
playence Spain SL
sergio.mar...@playence.com
Calle Vicente Gaceo 19
28029 Madrid - España
On 17/05/12 17:04, Jack Krupansky wrote:
Yes. From the doc:
"Multiple 'entity' parameters can be passed on to run multiple entities at once. If nothing is passed, all entities are executed."
See:
http://wiki.apache.org/solr/DataImportHandler
But that is one invocation of DIH, not two separate updates as you tried.
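As a sketch of what such a single invocation looks like, the "entity" parameter is simply repeated in the query string. The function name multi_entity_url is hypothetical. The host, port and handler path are the ones from the original question.

```python
from urllib.parse import urlencode


def multi_entity_url(entities, base="http://localhost:8080/solr/dataimport"):
    """Build one full-import URL that names several entities at once."""
    # A list of pairs (not a dict) lets the "entity" key appear repeatedly.
    params = [("command", "full-import"), ("clean", "false")]
    params += [("entity", entity) for entity in entities]
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(multi_entity_url(["users", "products"]))
```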
-- Jack Krupansky
-----Original Message----- From: Sergio Martín Cantero
Sent: Thursday, May 17, 2012 10:46 AM
To: solr-user@lucene.apache.org
Subject: Use DIH with more than one entity at the same time
I'm new to this list, so... Hello everybody.
I'm trying to run the DIH with more than one entity at the same time,
but only the first entity I call is being indexed. The other call has
no effect.
For example:
First call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
Before the indexing has finished, I call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products
The second call doesn't have any effect, and the products are not
indexed at all.
Isn't it possible to run more than one full import for different
entities at the same time?
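One way to see what is happening is to ask the handler for its state with command=status while the first import is running. The sketch below is only an illustration under assumptions: it assumes the default XML response contains a top-level <str name="status"> element whose value is "busy" during an import, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Same handler as in the calls above, queried with command=status.
STATUS_URL = "http://localhost:8080/solr/dataimport?command=status"


def parse_status(xml_text):
    """Return the text of the top-level <str name="status"> element, or None."""
    root = ET.fromstring(xml_text)
    node = root.find("str[@name='status']")
    return None if node is None else node.text


def handler_is_busy(url=STATUS_URL):
    """Fetch the status response and report whether an import is running."""
    with urlopen(url) as response:
        # Assumption: "busy" is the value reported while an import runs.
        return parse_status(response.read()) == "busy"


if __name__ == "__main__":
    print("busy" if handler_is_busy() else "not busy")
```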
Thanks a lot for your help
Sergio