Re: Solr Capacity Planning

2017-06-18 Thread Erick Erickson
I have no idea what Will's comment means, but I will say that at that scale you'd probably be well advised to get some professional consulting help; it'll save you a world of hassle. The basic sizing exercise is here: https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-ha

Re: Solr Capacity Planning

2017-06-17 Thread Will Martin
MODERATOR REQUESTED: On Jun 17, 2017, at 3:56 AM, Greenhorn Techie wrote: > Hi, We are planning to set up a Solr cloud for building a search application on huge volumes of data points (~hundreds of billions of Solr documents). I would like to understand if there is any recommendat

Solr Capacity Planning

2017-06-17 Thread Greenhorn Techie
Hi, We are planning to set up a Solr cloud for building a search application on huge volumes of data points (~hundreds of billions of Solr documents). I would like to understand if there is any recommendation on how to size the infrastructure and hardware requirements for Solr clusters. Also, what a

Re: SOLR capacity planning and Disaster relief

2012-10-23 Thread Otis Gospodnetic
Hi Worty, On Sun, Oct 21, 2012 at 2:30 AM, Worthy LaFollette wrote: > CAVEAT: I am a newbie w/r to SOLR (some Lucene experience, but not SOLR itself). Trying to come up to speed. What have you all done w/r to SOLR capacity planning and disaster relief? Re capacity plannin

SOLR capacity planning and Disaster relief

2012-10-20 Thread Worthy LaFollette
CAVEAT: I am a newbie w/r to SOLR (some Lucene experience, but not SOLR itself). Trying to come up to speed. What have you all done w/r to SOLR capacity planning and disaster relief? I am curious about the following metrics: - File handles and other ulimit/profile concerns - Space calculations

Re: Capacity Planning Guidance

2012-07-13 Thread Erick Erickson
This question, reasonable as it appears, is just unanswerable in the abstract. About all you can do is prototype and test. Take "facet queries". The hardware requirements vary drastically based on the number of unique values in the field(s) you're faceting on, as well as whether they're multi-value

Re: Architecture and Capacity planning for large Solr index

2011-11-23 Thread Erick Erickson
ed search and failover - start with SolrCloud, there are a couple >>> of pages about it on the Wiki and >>> http://blog.sematext.com/2011/09/14/solr-digest-spring-summer-2011-part-2-solr-cloud-and-near-real-time-search/ >>> >>> Otis >>> >>>

Re: Architecture and Capacity planning for large Solr index

2011-11-21 Thread Rahul Warawdekar
>> Lucene ecosystem search :: http://search-lucene.com/ >> >> >> > >> >From: Rahul Warawdekar >> >To: solr-user >> >Sent: Tuesday, October 11, 2011 11:47 AM >> >Subject: Architecture and Capacity planning fo

Re: Architecture and Capacity planning for large Solr index

2011-11-21 Thread Rahul Warawdekar
> >To: solr-user > >Sent: Tuesday, October 11, 2011 11:47 AM > >Subject: Architecture and Capacity planning for large Solr index > > > >Hi All, > > > >I am working on a Solr search based project, and would highly appreciate > >help/suggestions from yo

Re: capacity planning

2011-10-13 Thread Shawn Heisey
On 10/11/2011 11:49 AM, Toke Eskildsen wrote: Inline or top-posting? Long discussion, but for mailing lists I clearly prefer the former. Ditto. ;) I have little experience with VM servers for search. Although we use a lot of virtual machines, we use dedicated machines for our searchers, prim

RE: capacity planning

2011-10-13 Thread Jaeger, Jay - DOT
Message- From: eks...@googlemail.com [mailto:eks...@googlemail.com] On Behalf Of eks dev Sent: Tuesday, October 11, 2011 1:20 PM To: solr-user@lucene.apache.org Subject: Re: capacity planning Re. "I have little experience with VM servers for search." We had huge performance penalty on VMs

Re: capacity planning

2011-10-11 Thread Travis Low
Our plan for the VM is just benchmarking, not production. We will turn off all guest machines, then configure a Solr VM. Then we'll tweak memory and see what effect it has on indexing and searching. Then we'll reconfigure the number of processors used and see what that does. Then again with mor

Re: capacity planning

2011-10-11 Thread eks dev
Re. "I have little experience with VM servers for search." We had a huge performance penalty on VMs; CPU was the bottleneck. We couldn't freely run measurements to figure out what the problem really was (hosting was contracted by customer...), but it was something pretty scary, kind of 8-10 times slowe

Re: capacity planning

2011-10-11 Thread Otis Gospodnetic
the fields. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ > >From: Erik Hatcher >To: solr-user@lucene.apache.org >Sent: Tuesday, October 11, 2011 9:49 AM >Subject: Re: ca

Re: Architecture and Capacity planning for large Solr index

2011-10-11 Thread Otis Gospodnetic
e.com/ > >From: Rahul Warawdekar >To: solr-user >Sent: Tuesday, October 11, 2011 11:47 AM >Subject: Architecture and Capacity planning for large Solr index > >Hi All, > >I am working on a Solr search based project, and would highly appreciate >help/suggestions from

Re: capacity planning

2011-10-11 Thread Toke Eskildsen
Travis Low [t...@4centurion.com] wrote: > Toke, thanks. Comments embedded (hope that's okay): Inline or top-posting? Long discussion, but for mailing lists I clearly prefer the former. [Toke: Estimate characters] > Yes. We estimate each of the 23K DB records has 600 pages of text for the co

Re: capacity planning

2011-10-11 Thread Travis Low
Toke, thanks. Comments embedded (hope that's okay): On Tue, Oct 11, 2011 at 10:52 AM, Toke Eskildsen wrote: > > Greetings. I have a paltry 23,000 database records that point to a voluminous 300GB worth of PDF, Word, Excel, and other documents. We are planning on indexing the records a

Architecture and Capacity planning for large Solr index

2011-10-11 Thread Rahul Warawdekar
Hi All, I am working on a Solr search based project, and would highly appreciate help/suggestions from you all regarding Solr architecture and capacity planning. Details of the project are as follows: 1. There are 2 databases from which data needs to be indexed and made searchable

Re: capacity planning

2011-10-11 Thread Toke Eskildsen
On Tue, 2011-10-11 at 14:36 +0200, Travis Low wrote: > Greetings. I have a paltry 23,000 database records that point to a voluminous 300GB worth of PDF, Word, Excel, and other documents. We are planning on indexing the records and the documents they point to. I have no clue on how we can c

Re: capacity planning

2011-10-11 Thread Travis Low
Thanks, Erik! We probably won't use highlighting. Also, documents are added but *never* deleted. Does anyone have comments about memory and CPU resources required for indexing the 300GB of documents in a "reasonable" amount of time? It's okay if the initial indexing takes hours or maybe even da

Re: capacity planning

2011-10-11 Thread Paul Libbrecht
My experience was 10% of the size. On 11 Oct 2011, at 15:49, Erik Hatcher wrote: > (roughly 35% the size, generally).
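
To put the two figures quoted above side by side, here is a rough sketch of what they would imply for the 300GB of source documents mentioned in this thread. It assumes, purely for illustration, that both percentages are measured against the raw size of the source documents; the snippets do not confirm that baseline, and the function name and labels are made up for this example. Stored fields, if needed, would come on top of these numbers.

```python
# Rough index-size estimate from a size ratio.
# Assumption (not confirmed by the thread): both ratios are relative
# to the raw size of the source documents.

def estimate_index_size_gb(source_gb: float, ratio: float) -> float:
    """Return the estimated index size in GB for a given size ratio."""
    return source_gb * ratio

SOURCE_GB = 300  # size of the document collection mentioned in the thread

# Ratios quoted in the thread; the labels are only for display.
RATIOS = {
    "10% (Paul Libbrecht's experience)": 0.10,
    "35% (Erik Hatcher's rough figure)": 0.35,
}

estimates = {
    label: estimate_index_size_gb(SOURCE_GB, ratio)
    for label, ratio in RATIOS.items()
}

for label, size_gb in estimates.items():
    print(f"{label}: about {size_gb:.0f} GB")
```

The spread between the two results is wide enough that neither should be treated as more than a starting point for a prototype measured on real data.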

Re: capacity planning

2011-10-11 Thread Erik Hatcher
Travis - Whether the index is bigger than the original content depends on what you need to do with it in Solr. One of the primary deciding factors is whether you need to use highlighting, which currently requires that the fields to be highlighted be stored. Stored fields will take up about the same spa

capacity planning

2011-10-11 Thread Travis Low
Greetings. I have a paltry 23,000 database records that point to a voluminous 300GB worth of PDF, Word, Excel, and other documents. We are planning on indexing the records and the documents they point to. I have no clue on how we can calculate what kind of server we need for this. I imagine the