In our current lab project, we have already built a Chinese newspaper index with
18 million documents. The index size is around 51GB, so I am very concerned
about the memory issue you guys mentioned.
I also looked up the HathiTrust report on the SolrPerformanceData page:
http://wiki.apache.org/solr/SolrPerformanceData. They said their main
bottleneck is disk I/O, even though they have 10 shards spread over 4 servers.
Can you guys give me some suggestions about hardware specs & memory
configuration for our project?
Thanks in advance.
Scott
----- Original Message -----
From: "Lance Norskog" <goks...@gmail.com>
To: <solr-user@lucene.apache.org>
Sent: Tuesday, August 31, 2010 1:01 PM
Subject: Re: Hardware Specs Question
There are synchronization points, which become chokepoints at some
number of cores. I don't know where they cause Lucene to top out.
Lucene apps are generally disk-bound rather than CPU-bound, but yours will
be CPU-bound. There are so many variables that it's really not possible to
give any numbers.
Lance
On Mon, Aug 30, 2010 at 8:34 PM, Amit Nithian <anith...@gmail.com> wrote:
Lance,
That makes sense. I have heard about the long GC times on large heaps, but I
personally haven't experienced a slowdown, though that doesn't mean anything
either :-). Agreed that tuning the Solr caching is the way to go.
I haven't followed all the Solr/Lucene changes, but from what I remember
there are synchronization points that could be a bottleneck, so adding
more cores won't help with this problem? Or am I completely missing something?
Thanks again
Amit
On Mon, Aug 30, 2010 at 8:28 PM, scott chu (朱炎詹)
<scott....@udngroup.com> wrote:
I am as curious as Amit is. Can you give an example of the garbage
collection problem you mentioned?
----- Original Message -----
From: "Lance Norskog" <goks...@gmail.com>
To: <solr-user@lucene.apache.org>
Sent: Tuesday, August 31, 2010 9:14 AM
Subject: Re: Hardware Specs Question
It generally works best to tune the Solr caches and allocate enough
RAM to run comfortably. Linux, Windows et al. have their own cache
of disk blocks. They use very good algorithms for managing this cache.
Also, they do not make long garbage collection passes.
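
To see how much RAM the OS is currently using as its disk block cache on Linux, one option is to read /proc/meminfo. The following is only a minimal sketch: the helper names parse_meminfo and cached_gb are made up for illustration, and the SAMPLE text is illustrative input, not a measurement from any server in this thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cached_gb(fields):
    """Size of the OS page cache ("Cached" field, in kB) converted to GB."""
    return fields.get("Cached", 0) / (1024 * 1024)


# Illustrative sample only, used when /proc/meminfo is not available.
SAMPLE = """MemTotal:       16384000 kB
MemFree:          512000 kB
Cached:          7340032 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # not on Linux: fall back to the illustrative sample
    print(f"OS disk cache: {cached_gb(parse_meminfo(text)):.1f} GB")
```

If the reported cache size stays well below the index size while the server is under query load, that may be a hint that the Java heap is taking RAM the OS could otherwise use for the index files.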
On Mon, Aug 30, 2010 at 5:48 PM, Amit Nithian <anith...@gmail.com>
wrote:
Lance,
Thanks for your help. What do you mean when you say that the OS can keep
the index in memory better than Solr? Do you mean that you should use
another means to keep the index in memory (e.g. a ramdisk)? Is there a
generally accepted heap size/index size ratio that you follow?
Thanks
Amit
On Mon, Aug 30, 2010 at 5:00 PM, Lance Norskog <goks...@gmail.com>
wrote:
The price-performance knee for small servers is 32GB of RAM, 2-6 SATA
disks in a RAID array, and 8/16 cores. You can buy these servers and
half-fill them, leaving room for expansion.
I have not done benchmarks about the max # of processors that can be
kept busy during indexing or querying, and the total numbers: QPS,
response time averages & variability, etc.
If your index file size is 8G, and your Java heap is 8G, you will see
long garbage collection cycles. The operating system is very good at
keeping your index in memory, better than Solr can.
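
A rough way to look at this trade-off is to work out how much RAM is left for the OS disk cache once the Java heap is subtracted. The sketch below uses the numbers from the original question (16GB RAM, 8GB index, 8GB heap); the 1GB reserve for the OS and other processes and the alternative 4GB heap are assumptions for illustration only, not recommendations.

```python
def page_cache_headroom_gb(ram_gb, heap_gb, os_reserve_gb=1.0):
    """RAM left for the OS disk cache after the JVM heap and a reserve.

    os_reserve_gb is an assumed allowance for the OS and other processes.
    """
    return max(ram_gb - heap_gb - os_reserve_gb, 0.0)


def cacheable_fraction(index_gb, ram_gb, heap_gb, os_reserve_gb=1.0):
    """Fraction of the index files that could sit in the OS disk cache (capped at 1.0)."""
    headroom = page_cache_headroom_gb(ram_gb, heap_gb, os_reserve_gb)
    return min(headroom / index_gb, 1.0)


if __name__ == "__main__":
    # 8GB heap is the current setting from the question; 4GB is a hypothetical alternative.
    for heap in (8, 4):
        frac = cacheable_fraction(index_gb=8, ram_gb=16, heap_gb=heap)
        print(f"heap={heap}GB: {frac:.0%} of the 8GB index could fit in the OS cache")
```

This ignores everything except raw sizes (query mix, Solr cache settings, other processes), so treat it as back-of-the-envelope arithmetic rather than a sizing rule.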
Lance
On Mon, Aug 30, 2010 at 4:52 PM, Amit Nithian <anith...@gmail.com>
wrote:
> Hi all,
>
> I am curious to get some opinions on at what point having more CPU
> cores shows diminishing returns in terms of QPS. Our index size is about
> 8GB and we have 16GB of RAM on a quad core 4 x 2.4 GHz AMD Opteron 2216.
> Currently I have the heap set to 8GB.
>
> We are looking to get more servers to increase capacity and because the
> warranty is set to expire on our old servers, so I was curious, before
> asking for a certain spec, what others run and at what point having more
> cores ceases to matter. Mainly looking at somewhere between 4-12 cores
> per server.
>
> Thanks!
> Amit
>
--
Lance Norskog
goks...@gmail.com