Re: Large Data Set Suggestions

2008-11-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] > Sent: Thursday, November 06, 2008 8:39 PM > To: solr-user@lucene.apache.org > Subject: Re: Large Data Set Suggestions > > Hi Lance, > This is one area we left open in DIH. What is the best way to handle this? On > error

RE: Large Data Set Suggestions

2008-11-07 Thread Lance Norskog
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] Sent: Thursday, November 06, 2008 8:39 PM To: solr-user@lucene.apache.org Subject: Re: Large Data Set Suggestions Hi Lance, This is one area we left open in DIH. What is the best way to handle this? On error it should give up or continue w

Re: Large Data Set Suggestions

2008-11-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
essage- > From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] > Sent: Thu 11/6/2008 11:38 PM > To: solr-user@lucene.apache.org > Subject: Re: Large Data Set Suggestions > > Hi Lance, > This is one area we left open in DIH. What is the best way to handle > this

RE: Large Data Set Suggestions

2008-11-07 Thread Steven Anderson
Ideally, it would be a configuration option. Also, it would be great to have a hook to log or process an exception. Steve -Original Message- From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] Sent: Thu 11/6/2008 11:38 PM To: solr-user@lucene.apache.org Subject: Re: Large Data
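The "configuration option plus hook" idea in this message can be illustrated with a minimal Python sketch. This is not DIH's API: the names `index_all`, `on_error`, `error_hook`, and `fake_index` are hypothetical and chosen only for illustration. The policy value decides whether a failing record stops the run ("abort") or is skipped ("continue"), and the hook sees every failure either way.

```python
def index_all(records, index_one, on_error="abort", error_hook=None):
    """Index each record; return (indexed_count, failed_count).

    on_error   -- hypothetical policy: "abort" re-raises on the first failure,
                  "continue" skips the failing record and keeps going.
    error_hook -- optional callable(record, exception) for logging or processing.
    """
    if on_error not in ("abort", "continue"):
        raise ValueError("on_error must be 'abort' or 'continue'")
    indexed = 0
    failed = 0
    for record in records:
        try:
            index_one(record)
            indexed += 1
        except Exception as exc:
            failed += 1
            if error_hook is not None:
                error_hook(record, exc)  # the hook runs under both policies
            if on_error == "abort":
                raise
    return indexed, failed


def fake_index(record):
    """Stand-in for a real indexing call; rejects records without an id."""
    if record.get("id") is None:
        raise ValueError("missing id")


errors = []
result = index_all(
    [{"id": 1}, {}, {"id": 3}],
    fake_index,
    on_error="continue",
    error_hook=lambda record, exc: errors.append(str(exc)),
)
```

With `on_error="continue"` the bad record is counted and reported through the hook while the rest of the batch is still indexed; with `"abort"` the same exception propagates to the caller after the hook has run.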

Re: Large Data Set Suggestions

2008-11-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
Message- > From: Steven Anderson [mailto:[EMAIL PROTECTED] > Sent: Thursday, November 06, 2008 5:57 AM > To: solr-user@lucene.apache.org > Subject: RE: Large Data Set Suggestions > >> In that case you may put the file in a mounted NFS directory or you >> can serve it ou

RE: Large Data Set Suggestions

2008-11-06 Thread Lance Norskog
error. Lance -Original Message- From: Steven Anderson [mailto:[EMAIL PROTECTED] Sent: Thursday, November 06, 2008 5:57 AM To: solr-user@lucene.apache.org Subject: RE: Large Data Set Suggestions > In that case you may put the file in a mounted NFS directory or you > can serve it ou

Re: Large Data Set Suggestions

2008-11-06 Thread Walter Underwood
100X, not 10X. And with the index on NFS. Reading the input data from NFS would be slower than local, but probably not 10X. --wunder On 11/6/08 5:56 AM, "Steven Anderson" <[EMAIL PROTECTED]> wrote: > That's one option although someone else on the list mentioned that > performance was 10x slower i

RE: Large Data Set Suggestions

2008-11-06 Thread Steven Anderson
> In that case you may put the file in a mounted NFS directory > or you can serve it out with an apache server. That's one option although someone else on the list mentioned that performance was 10x slower in their NFS experience. Another option is to serve up the files via Apache and pull them

Re: Large Data Set Suggestions

2008-11-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Nov 6, 2008 at 7:04 PM, Steven Anderson <[EMAIL PROTECTED]> wrote: >> The performance of DIH is likely to be faster than SolrJ, >> because it does not have the overhead of an HTTP request. > > Understood. However, we may not have the option of co-locating the data > to be ingested with t

RE: Large Data Set Suggestions

2008-11-06 Thread Steven Anderson
> The performance of DIH is likely to be faster than SolrJ, > because it does not have the overhead of an HTTP request. Understood. However, we may not have the option of co-locating the data to be ingested with the Solr server. > What is your data source? I am assuming it is XML. Yes. Inco

Re: Large Data Set Suggestions

2008-11-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
The performance of DIH is likely to be faster than SolrJ, because it does not have the overhead of an HTTP request. What is your data source? I am assuming it is XML. SolrJ cannot directly index XML. You may need to read the docs from the XML before SolrJ can index them. --Noble On Wed, Nov 5, 2008 at
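The "read docs from XML first" step mentioned here can be sketched as follows. This is a Python stand-in for the parsing step only, not SolrJ code, and it assumes the input uses Solr's `<add><doc><field name="...">` update-XML layout; the actual format of the data in this thread is not stated. The function name `read_docs` and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def read_docs(source):
    """Yield one dict per <doc> element, mapping field name -> list of values."""
    # iterparse streams the file, so the whole document is never held at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "doc":
            continue
        doc = {}
        for field in elem.findall("field"):
            # A field name may repeat (multi-valued), so collect values in a list.
            doc.setdefault(field.get("name"), []).append(field.text or "")
        yield doc
        elem.clear()  # drop the doc's children to limit memory growth on large files


SAMPLE = b"""<add>
  <doc>
    <field name="id">1</field>
    <field name="title">first</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="tag">a</field>
    <field name="tag">b</field>
  </doc>
</add>"""

docs = list(read_docs(io.BytesIO(SAMPLE)))
```

Each yielded dict would then be handed to whatever client does the indexing. Streaming matters at the 10M-60M document scale discussed in the thread, where loading the full XML tree into memory is unlikely to be practical.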

Re: Large Data Set Suggestions

2008-11-05 Thread souravm
Hi Fergus, Do the 6.6M docs reside on a single box (node) or on multiple boxes? Do you use distributed search? Regards, Sourav - Original Message - From: Fergus McMenemie <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Wed Nov 05 08:21:45 2008 Subject: Re: Large Da

Re: Large Data Set Suggestions

2008-11-05 Thread Fergus McMenemie
>Greetings! > >I've been asked to do some indexing performance testing on Solr 1.3 >using large XML document data sets (10M-60M docs) with DIH versus SolrJ. > > >Does anyone have any suggestions where I might find a good data set this >size? > >I saw the Wikipedia dump reference in the DIH wik

Large Data Set Suggestions

2008-11-05 Thread Steven Anderson
Greetings! I've been asked to do some indexing performance testing on Solr 1.3 using large XML document data sets (10M-60M docs) with DIH versus SolrJ. Does anyone have any suggestions where I might find a good data set this size? I saw the Wikipedia dump reference in the DIH wiki, but tha