Re: Limit of Index size per machine..

2009-08-06 Thread Tom Burton-West
Hello, I think you are confusing the size of the data you want to index with the size of the index. For our indexes (large full text documents) the Solr index is about 1/3 of the size of the documents being indexed. For 3 TB of data you might have an index of 1 TB or less. This depends on many
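To make the data-size versus index-size distinction concrete, here is a rough sketch of the arithmetic. The function name is hypothetical, and the 1/3 ratio is only the figure reported above for large full-text documents; it is an assumption for any other collection, so measure your own ratio on a sample.

```python
def estimate_index_size_tb(data_tb: float, index_ratio: float = 1 / 3) -> float:
    """Estimate index size (TB) from raw document size (TB).

    index_ratio is an assumed index-to-data ratio, not a Solr constant.
    The default of 1/3 is the figure quoted in the thread for full-text
    documents; other schemas and analyzers will give different ratios.
    """
    if data_tb < 0 or index_ratio < 0:
        raise ValueError("sizes and ratios must be non-negative")
    return data_tb * index_ratio


if __name__ == "__main__":
    # The thread's example: 3 TB of documents at a ratio of about 1/3.
    print(f"{estimate_index_size_tb(3.0):.2f} TB")
```

With the thread's numbers this gives roughly 1 TB of index for 3 TB of data.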

Re: Limit of Index size per machine..

2009-08-06 Thread Ian Connor
> our system Engineers. i.e In case if we run into any HDFS issues, SEs won't be supporting us :(
>
> Regards,
> sS
>
> --- On Thu, 8/6/09, Walter Underwood wrote:
>
> > From: Walter Underwood
> > Subject: Re: Limit of Index size per machine..

Re: Limit of Index size per machine..

2009-08-05 Thread Silent Surfer
Underwood wrote:
> From: Walter Underwood
> Subject: Re: Limit of Index size per machine..
> To: solr-user@lucene.apache.org
> Date: Thursday, August 6, 2009, 5:12 AM
> That is why people don't use search engines to manage logs. Look at a Hadoop cluster.

Re: Limit of Index size per machine..

2009-08-05 Thread Walter Underwood
it of Index size per machine..
To: solr-user@lucene.apache.org
Date: Wednesday, August 5, 2009, 9:38 PM
I try to keep the index directory size less than the amount of RAM and rely on the OS to cache as it needs. Linux does a pretty good job here and I am sure OS X will do a good job also. Distri

Re: Limit of Index size per machine..

2009-08-05 Thread Silent Surfer
how many servers were used for indexing alone.
Thanks,
sS
--- On Wed, 8/5/09, Ian Connor wrote:
> From: Ian Connor
> Subject: Re: Limit of Index size per machine..
> To: solr-user@lucene.apache.org
> Date: Wednesday, August 5, 2009, 9:38 PM
> I try to keep the index directory

Re: Limit of Index size per machine..

2009-08-05 Thread Ian Connor
I try to keep the index directory size less than the amount of RAM and rely on the OS to cache as it needs. Linux does a pretty good job here and I am sure OS X will do a good job also. Distributed search here will be your friend so you can chunk it up to a number of servers to keep your cost down
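The rule of thumb above (keep each machine's index smaller than its RAM, and split the index across servers when it is not) can be sketched as a quick sizing calculation. This is only an illustration: the function name and the cache_fraction parameter are hypothetical, and the 0.75 default is an assumed share of RAM left to the OS page cache, not a figure from the thread.

```python
import math


def servers_needed(index_gb: float, ram_per_server_gb: float,
                   cache_fraction: float = 0.75) -> int:
    """Count servers so that each server's share of the index fits in RAM.

    cache_fraction is the assumed share of RAM the OS can use to cache
    index files; the remainder is left for the JVM and other processes.
    """
    if index_gb < 0:
        raise ValueError("index_gb must be non-negative")
    if ram_per_server_gb <= 0 or not 0 < cache_fraction <= 1:
        raise ValueError("need positive RAM and 0 < cache_fraction <= 1")
    usable_gb = ram_per_server_gb * cache_fraction
    # Round up, and always use at least one server.
    return max(1, math.ceil(index_gb / usable_gb))


if __name__ == "__main__":
    # Hypothetical example: a 1 TB (1024 GB) index on 32 GB machines.
    print(servers_needed(1024, 32))
```

Each server would then hold one slice of the index, and a distributed query fans out across all of them.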