>>> [...] could potentially be used. Not
>>> sure if that would help or not.
>>>
>>> I might as well add that the item description will have indexed, stored
>>> and term vectors set to true.
On Jun 2, 2010, at 10:30 PM, Blargy wrote:
> What's more efficient: a batch size of 1000 or -1 for MySQL? Is this why it's
> so slow, because I am using 2 different datasources?
>
By batch size, I meant the number of docs sent from the client to Solr. MySQL
Batch Size is broken. The only thing th[...]
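For reference: in DIH's data-config.xml, batchSize on a JdbcDataSource controls the
JDBC fetch size, and with the MySQL Connector/J driver batchSize="-1" is the value
that enables row streaming instead of buffering the whole result set. A minimal
sketch; the driver, URL, and credentials below are placeholders, not taken from
this thread:

  <!-- batchSize="-1" maps to fetchSize=Integer.MIN_VALUE on the MySQL driver,
       so rows are streamed instead of held in memory all at once -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://db-host/items"
              user="solr" password="***"
              batchSize="-1"/>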
[...] how long that takes, i.e. creating a SQL script that just pulls that data
out, how long does that take?
Also, how many fields are you indexing per document: 10, 50, 100?
--
View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866538.html
Sent from the Solr - User mailing list archive at Nabble.com.
--- On Wed, 6/2/10, Grant Ingersoll wrote:
From: Grant Ingersoll
Subject: Re: Importing large datasets
To: solr-user@lucene.apache.org
Date: Wednesday, June 2, 2010, 3:42 AM
On Jun 1, 2010, at 9:54 PM, Blargy wrote:
We have around 5 million items in our index and each item has a description
located on a separate physical database [...]
http://www.yert.com/film.php
--- On Wed, 6/2/10, Andrzej Bialecki wrote:
From: Andrzej Bialecki
Subject: Re: Importing large datasets
To: solr-user@lucene.apache.org
Date: Wednesday, June 2, 2010, 4:52 AM
On 2010-06-02 13:12, Grant Ingersoll wrote:
On Jun 2, 2010, at 6:53 AM, Andrzej Bialecki wrote:
Would dumping the databases to a local file help at all?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866538.html
Sent from the Solr - User mailing list archive at Nabble.com.
>> I might as well add that the item description will have indexed, stored
>> and term vectors set to true.
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865219.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> [...] the full process takes 4 hours, which is bearable. However,
>>> once I started to import the item description, which is located on a
>>> separate machine/database, the import process exploded to over 24 hours.
http://www.yert.com/film.php
--- On Wed, 6/2/10, David Stuart wrote:
> From: David Stuart
> Subject: Re: Importing large datasets
> To: "solr-user@lucene.apache.org"
> Date: Wednesday, June 2, 2010, 12:00 PM
> How long does it take to do a grab of all the data via [...]
[...] at http://www.yert.com/film.php
--- On Wed, 6/2/10, Andrzej Bialecki wrote:
> From: Andrzej Bialecki
> Subject: Re: Importing large datasets
> To: solr-user@lucene.apache.org
> Date: Wednesday, June 2, 2010, 4:52 AM
> On 2010-06-02 13:12, Grant Ingersoll wrote:
> >
>
--- On Wed, 6/2/10, Grant Ingersoll wrote:
> From: Grant Ingersoll
> Subject: Re: Importing large datasets
> To: solr-user@lucene.apache.org
> Date: Wednesday, June 2, 2010, 3:42 AM
>
> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>
> >
> > We have around 5 million items in our index and each item has a description
> > located on a separate physical database [...]
>>
>> Not sure how much that would help. As I mentioned, without the item
>> description import the full process takes 4 hours, which is bearable. However,
>> once I started to import the item description, which is located on a
>> separate machine/database, the import process exploded to over 24 hours.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865324.html
Sent from the Solr - User mailing list archive at Nabble.com.
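One pattern often suggested for exactly this shape of problem (a DIH sub-entity
issuing one query per parent row against a remote description database) is
CachedSqlEntityProcessor, which reads the child table in one pass and joins by key
in memory. A rough sketch only; table, column, and dataSource names here are
placeholders, not from this thread:

  <entity name="item" dataSource="items-db"
          query="SELECT id, name FROM item">
    <!-- CachedSqlEntityProcessor pulls the description table once and joins
         by key in memory, instead of one remote query per item row -->
    <entity name="description" dataSource="descriptions-db"
            processor="CachedSqlEntityProcessor"
            query="SELECT item_id, description FROM item_description"
            where="item_id=item.id">
      <field column="description" name="description"/>
    </entity>
  </entity>

The trade-off is memory: the child rows are cached, so with 5 million large
descriptions it may not fit, in which case dumping to a local file or splitting
the import may be the more realistic route.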
[...] has a pool of 2 DB's that could potentially be used. Not
sure if that would help or not.
I might as well add that the item description will have indexed, stored and term
vectors set to true.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865219.html
Sent from the Solr - User mailing list archive at Nabble.com.
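For context, "indexed, stored and term vectors set to true" corresponds to a
schema.xml declaration along these lines (the field name and type are placeholders):

  <!-- description: indexed for search, stored for retrieval/highlighting,
       term vectors enabled; storing large text plus term vectors adds
       noticeably to both index size and indexing time -->
  <field name="description" type="text"
         indexed="true" stored="true" termVectors="true"/>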
System Integration
> http://www.sigram.com Contact: info at sigram dot com
>
>
>
I haven't worked with Hadoop before but I'm willing to try anything to cut
down this full import time. I see this currently uses the embedded solr
server for indexing... would I have to scrap my DIH importing then?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865103.html
Sent from the Solr - User mailing list archive at Nabble.com.
[...] description. Maybe it's not the best?
Here is an overview of my data-config.xml. Thoughts?
...
I appreciate the help.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865091.html
Sent from the Solr - User mailing list archive at Nabble.com.
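The actual data-config.xml is elided above, but the general shape under discussion,
one dataSource for the item database and a second for the remote description
database with a sub-entity join, would look roughly like this. All names, URLs,
and queries below are illustrative placeholders, not the poster's real config:

  <dataConfig>
    <dataSource name="items-db" type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://items-host/items" batchSize="-1"/>
    <dataSource name="descriptions-db" type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://desc-host/descriptions" batchSize="-1"/>
    <document>
      <entity name="item" dataSource="items-db"
              query="SELECT id, name FROM item">
        <field column="id" name="id"/>
        <field column="name" name="name"/>
        <!-- one extra query per item row, against a different machine; this
             round trip is the usual suspect when the import time explodes -->
        <entity name="description" dataSource="descriptions-db"
                query="SELECT description FROM item_description WHERE item_id='${item.id}'">
          <field column="description" name="description"/>
        </entity>
      </entity>
    </document>
  </dataConfig>

If that per-row sub-query is the bottleneck, the CachedSqlEntityProcessor variant
sketched earlier avoids it at the cost of memory.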
On 2010-06-02 13:12, Grant Ingersoll wrote:
>
> On Jun 2, 2010, at 6:53 AM, Andrzej Bialecki wrote:
>
>> On 2010-06-02 12:42, Grant Ingersoll wrote:
>>>
>>> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>>>
We have around 5 million items in our index and each item has a description
located on a separate physical database [...]
On Jun 2, 2010, at 6:53 AM, Andrzej Bialecki wrote:
> On 2010-06-02 12:42, Grant Ingersoll wrote:
>>
>> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>>
>>>
>>> We have around 5 million items in our index and each item has a description
>>> located on a separate physical database. These item descriptions vary in
>>> size and for the most part are quite large.
On 2010-06-02 12:42, Grant Ingersoll wrote:
>
> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>
>>
>> We have around 5 million items in our index and each item has a description
>> located on a separate physical database. These item descriptions vary in
>> size and for the most part are quite large.
On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>
> We have around 5 million items in our index and each item has a description
> located on a separate physical database. These item descriptions vary in
> size and for the most part are quite large. Currently we are only indexing
> items and not their descriptions. [...]
How would one speed up this process? Is there a way to run
parallel import processes and then merge them together at the end? Possibly
use some sort of distributed computing?
Any ideas. Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p863447.html
Sent from the Solr - User mailing list archive at Nabble.com.