On 8/12/2011 3:32 PM, Eric Myers wrote:
Recently started looking into solr to solve a problem created before my
time. We have a dataset consisting of 390,000,000+ records that had a
search written for it using a simple query. The problem is that the
dataset needs additional indices to keep operating. The DBA says no go,
too large a da
We have a 200,000,000 record index with 14 fields, and we can re-index
the entire data set in about five hours. One thing to note is that the
DataImportHandler uses one thread per entity by default. If you have a
multicore box, you can drastically speed up indexing by specifying a
threadcount of n+1, where n is the number of cores.
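For reference, in the Solr 3.x line the DataImportHandler exposes this as a `threads` attribute on the entity (it was later removed in 4.x). A minimal sketch only; the entity name, query, and column names below are hypothetical, and it assumes an 8-core box, hence threads="9":

```xml
<!-- data-config.xml fragment: threads is set per entity -->
<document>
  <entity name="record" threads="9"
          query="SELECT id, title FROM records">
    <field column="id" name="id"/>
    <field column="title" name="title"/>
  </entity>
</document>
```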
On Sat, Dec 13, 2008 at 11:45 AM, Kay Kay wrote:
> True - currently playing around with MySQL. But I was trying to
> understand more about how the Statement object is getting created (in the
> case of a platform/vendor-specific query like this). Are we going through
> JPA internally in Solr
Shalin Shekhar Mangar wrote:
On Sat, Dec 13, 2008 at 11:03 AM, Kay Kay wrote:
Thanks Shalin for the clarification.
The case about Lucene taking more time to index the Document when compared
to DataImportHandler creating the input is definitely intuitive.
But just curious about the underlying architecture on which the test was
being run. Was this performed on a multi-core
On Sat, Dec 13, 2008 at 4:51 AM, Kay Kay wrote:
> Thanks Bryan.
>
> That clarifies a lot.
>
> But even with streaming - retrieving one document at a time and adding to
> the IndexWriter seems to make it more serialized.
>
We have experimented with making DataImportHandler multi-threaded in
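The serialization worry above can be sketched with a producer-consumer pattern: one thread streams rows (as a JDBC cursor would) while several workers build and add documents. This is an illustration only, not DIH's actual code; the class, `Row` type, and method names are made up, and the counter stands in for `IndexWriter.addDocument`:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class StreamingIndexSketch {

    record Row(int id) {}

    // Sentinel that tells a worker the stream is done; each worker passes it on.
    static final Row POISON = new Row(-1);

    public static int indexAll(int rowCount, int workers) {
        BlockingQueue<Row> queue = new ArrayBlockingQueue<>(100);
        AtomicInteger indexed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {
                try {
                    for (Row r = queue.take(); r != POISON; r = queue.take()) {
                        indexed.incrementAndGet(); // stand-in for IndexWriter.addDocument
                    }
                    queue.put(POISON); // pass the sentinel to the next worker
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            // "Producer": streams rows into the bounded queue one by one,
            // the way a streaming JDBC result set would deliver them.
            for (int i = 0; i < rowCount; i++) {
                queue.put(new Row(i));
            }
            queue.put(POISON);
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return indexed.get();
    }

    public static void main(String[] args) {
        System.out.println(indexAll(1000, 4)); // prints 1000
    }
}
```

The bounded queue keeps the producer from outrunning the workers, so memory stays flat even for very large result sets.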
From: Bryan Talbot
Subject: Re: Solr - DataImportHandler - Large Dataset results ?
To: solr-user@lucene.apache.org
Date: Friday, December 12, 2008, 5:26 PM
It only supports streaming if properly enabled, which is completely lame:
http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-implementation
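For the record: Connector/J only streams a result set when the statement is TYPE_FORWARD_ONLY and CONCUR_READ_ONLY with the fetch size set to Integer.MIN_VALUE; otherwise it buffers every row in memory. In DataImportHandler the knob for this is batchSize="-1" on the JdbcDataSource. A sketch only; the URL and credentials below are placeholders:

```xml
<!-- data-config.xml: batchSize="-1" makes DIH call setFetchSize(Integer.MIN_VALUE),
     which is what tells Connector/J to stream rows instead of buffering them all -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="reader" password="secret"
            batchSize="-1"/>
```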
From: Shalin Shekhar Mangar
Subject: Re: Solr - DataImportHandler - Large Dataset results ?
To: solr-user@lucene.apache.org
Date: Friday, December 12, 2008, 9:41 PM
DataImportHandler is designed to stream rows one by one to create Solr
documents. As long as your database driver supports streaming, you should be
fine. Which database are you using?
On Sat, Dec 13, 2008 at 2:20 AM, Kay Kay wrote:
As per the example in the wiki - http://wiki.apache.org/solr/DataImportHandler
- I am seeing the following fragment.
..
My scaled-down application looks very similar along these lines, but my
resultset is s
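For anyone following along, the fragment on that wiki page is along these lines (a minimal sketch; the driver, URL, and column names are the wiki's example values, not anything from this thread):

```xml
<dataConfig>
  <dataSource driver="org.hsqldb.jdbcDriver"
              url="jdbc:hsqldb:/temp/example/ex" user="sa"/>
  <document>
    <entity name="item" query="SELECT * FROM item">
      <field column="ID" name="id"/>
    </entity>
  </document>
</dataConfig>
```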