Re: Importing large datasets

2010-06-07 Thread Alexey Serba
>>> could potentially be used. Not sure if that would help or not.
>>> I might as well mention that the item description will have indexed, stored and term vectors set to true.
>>> --
>>> View this message in context

Re: Importing large datasets

2010-06-03 Thread Grant Ingersoll
On Jun 2, 2010, at 10:30 PM, Blargy wrote:
> What's more efficient, a batch size of 1000 or -1 for MySQL? Is it so slow because I am using 2 different datasources?
By batch size, I meant the number of docs sent from the client to Solr. MySQL Batch Size is broken. The only thing th
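To make the distinction concrete: "batch size" in the sense of docs sent from the client to Solr per request (as opposed to a JDBC-level setting) could look roughly like the Python sketch below. The names `batched`, `index_in_batches` and the `send_batch` callback are hypothetical and not from this thread; `send_batch` stands in for whatever client call actually posts a list of documents to Solr.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


def batched(items: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def index_in_batches(
    docs: Iterable[Dict],
    send_batch: Callable[[List[Dict]], None],  # hypothetical: posts one batch to Solr
    size: int = 1000,
) -> int:
    """Send docs in batches of `size`; return the number of batches sent."""
    sent = 0
    for chunk in batched(docs, size):
        send_batch(chunk)
        sent += 1
    return sent
```

Because `docs` can be a generator, the full result set never has to sit in memory at once; only one batch is materialized at a time.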

Re: Importing large datasets

2010-06-03 Thread Erik Hatcher
w how long that takes, i.e. creating a SQL script that just pulls that data out; how long does that take? Also, how many fields are you indexing per document: 10, 50, 100? -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866538.html Sent fro

Re: Importing large datasets

2010-06-02 Thread David Stuart
? Also, how many fields are you indexing per document: 10, 50, 100? -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866538.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Importing large datasets

2010-06-02 Thread David Stuart
wrote:
From: Grant Ingersoll
Subject: Re: Importing large datasets
To: solr-user@lucene.apache.org
Date: Wednesday, June 2, 2010, 3:42 AM
On Jun 1, 2010, at 9:54 PM, Blargy wrote:
We have around 5 million items in our index and each item has a description located on a separate physical

Re: Importing large datasets

2010-06-02 Thread David Stuart
w.yert.com/film.php
--- On Wed, 6/2/10, Andrzej Bialecki wrote:
From: Andrzej Bialecki
Subject: Re: Importing large datasets
To: solr-user@lucene.apache.org
Date: Wednesday, June 2, 2010, 4:52 AM
On 2010-06-02 13:12, Grant Ingersoll wrote:
On Jun 2, 2010, at 6:53 AM, Andrzej Bialecki wrote:

Re: Importing large datasets

2010-06-02 Thread Blargy
Would dumping the databases to a local file help at all? -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p866538.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Importing large datasets

2010-06-02 Thread Blargy
>> I might as well mention that the item description will have indexed, stored and term vectors set to true.
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865219.html
>> Sent from the Solr

Re: Importing large datasets

2010-06-02 Thread Blargy
>>> the full process takes 4 hours which is bearable. However once I started to import the item description which is located on a separate machine/database the import process exploded to over 24 hours.
>>> --
>>> View thi

Re: Importing large datasets

2010-06-02 Thread Dennis Gearon
http://www.yert.com/film.php
--- On Wed, 6/2/10, David Stuart wrote:
> From: David Stuart
> Subject: Re: Importing large datasets
> To: "solr-user@lucene.apache.org"
> Date: Wednesday, June 2, 2010, 12:00 PM
> How long does it take to do a grab of all the data via

Re: Importing large datasets

2010-06-02 Thread Dennis Gearon
gh at http://www.yert.com/film.php
--- On Wed, 6/2/10, Andrzej Bialecki wrote:
> From: Andrzej Bialecki
> Subject: Re: Importing large datasets
> To: solr-user@lucene.apache.org
> Date: Wednesday, June 2, 2010, 4:52 AM
> On 2010-06-02 13:12, Grant Ingersoll wrote:

Re: Importing large datasets

2010-06-02 Thread Dennis Gearon
e:
> From: Grant Ingersoll
> Subject: Re: Importing large datasets
> To: solr-user@lucene.apache.org
> Date: Wednesday, June 2, 2010, 3:42 AM
> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
> > We have around 5 million items in our index and each it

Re: Importing large datasets

2010-06-02 Thread Lance Norskog
>> Not sure how much that would help. As I mentioned, without the item description import the full process takes 4 hours, which is bearable. However, once I started to import the item description, which is located on a separate machi

Re: Importing large datasets

2010-06-02 Thread David Stuart
nabble.com/Importing-large-datasets-tp863447p865324.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Importing large datasets

2010-06-02 Thread Blargy
l process takes 4 hours which is bearable. However once I started to import the item description which is located on a separate machine/database the import process exploded to over 24 hours. -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p8653

Re: Importing large datasets

2010-06-02 Thread Erik Hatcher
used. Not sure if that would help or not. I might as well mention that the item description will have indexed, stored and term vectors set to true. -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865219.html Sent from the Solr - User mailing

Re: Importing large datasets

2010-06-02 Thread Blargy
has a pool of 2 DBs that could potentially be used. Not sure if that would help or not. I might as well mention that the item description will have indexed, stored and term vectors set to true. -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p8

Re: Importing large datasets

2010-06-02 Thread Blargy
> System Integration
> http://www.sigram.com Contact: info at sigram dot com
I haven't worked with Hadoop before, but I'm willing to try anything to cut down this full import time. I see this currently uses the embedded Solr server for indexing... would I have to scrap my DIH importing then?
-- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865103.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Importing large datasets

2010-06-02 Thread Blargy
scription. Maybe it's not the best? Here is an overview of my data-config.xml. Thoughts? ... I appreciate the help. -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p865091.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Importing large datasets

2010-06-02 Thread Andrzej Bialecki
On 2010-06-02 13:12, Grant Ingersoll wrote:
> On Jun 2, 2010, at 6:53 AM, Andrzej Bialecki wrote:
>> On 2010-06-02 12:42, Grant Ingersoll wrote:
>>> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>>>> We have around 5 million items in our index and each item has a description loc

Re: Importing large datasets

2010-06-02 Thread Grant Ingersoll
On Jun 2, 2010, at 6:53 AM, Andrzej Bialecki wrote:
> On 2010-06-02 12:42, Grant Ingersoll wrote:
>> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>>> We have around 5 million items in our index and each item has a description located on a separate physical database. These item descr

Re: Importing large datasets

2010-06-02 Thread Andrzej Bialecki
On 2010-06-02 12:42, Grant Ingersoll wrote:
> On Jun 1, 2010, at 9:54 PM, Blargy wrote:
>> We have around 5 million items in our index and each item has a description located on a separate physical database. These item descriptions vary in size and for the most part are quite large.

Re: Importing large datasets

2010-06-02 Thread Grant Ingersoll
On Jun 1, 2010, at 9:54 PM, Blargy wrote:
> We have around 5 million items in our index and each item has a description located on a separate physical database. These item descriptions vary in size and for the most part are quite large. Currently we are only indexing items and not their

Importing large datasets

2010-06-01 Thread Blargy
would one speed up this process? Is there a way to run parallel import processes and then merge them together at the end? Possibly use some sort of distributed computing? Any ideas? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Importing-large-datasets-tp863447p863447
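One way to picture the "parallel import processes" idea is to split the item ID space into contiguous ranges and hand each range to a separate import process. The Python sketch below only computes the ranges. It assumes numeric, reasonably dense item IDs, which is an assumption and not something stated in the thread, and `id_ranges` is a hypothetical helper name. Whether this helps at all depends on where the bottleneck is (database, network, or indexing).

```python
from typing import List, Tuple


def id_ranges(min_id: int, max_id: int, workers: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [min_id, max_id] into up to `workers`
    contiguous, non-overlapping inclusive ranges of near-equal size."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    if max_id < min_id:
        return []
    total = max_id - min_id + 1
    base, extra = divmod(total, workers)
    ranges = []
    start = min_id
    for i in range(workers):
        # The first `extra` ranges take one additional ID each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            break  # more workers than IDs; remaining workers get nothing
        ranges.append((start, start + length - 1))
        start += length
    return ranges
```

Each `(low, high)` pair could then be used as the bounds of a `WHERE id BETWEEN low AND high` clause in the query run by one import process.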