On 11/4/2018 8:38 AM, Chuming Chen wrote:
> I have shared a tar ball with you (apa...@elyograg.org) from Google Drive. The
> tar ball includes the logs directories of 4 nodes, solrconfig.xml, solr.in.sh,
> and a screenshot of the TOP command. The log files cover about 1 day. However,
> I restarted the SolrCloud cluster several times during that period.
Runtime represented in the GC log for node1 is 23 minutes. Not anywhere
near a full day.
Runtime represented in the GC log for node2 is just under 16 minutes.
Runtime represented in the GC log for node3 is 434 milliseconds.
Runtime represented in the GC log for node4 is 501 milliseconds.
This is not enough to even make a guess, much less a reasoned
recommendation, about the heap size you will actually need. There must be
enough runtime, covering a number of significant garbage collections, for
us to get a sense of how much memory the application actually needs.
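For what it's worth, a quick way to check how much runtime a GC log actually
covers, and what the heap settles to after collections, is to pull the uptime
prefix and the before->after(total) figures out of each line. This is only a
rough sketch, assuming Java 8-style GC logging (PrintGCTimeStamps and
PrintGCDetails, which Solr's stock settings enable); adjust the regexes if
your log format differs:

#!/usr/bin/env python3
"""Rough GC log summary: runtime covered and post-GC heap sizes.

Assumes Java 8-style GC logs, where collection lines carry an uptime
prefix like "1234.567: [GC ..." and a heap transition like
"812345K->123456K(4145152K)". Adjust the regexes for other formats.
"""
import re
import sys

UPTIME = re.compile(r'(\d+\.\d+): \[')            # seconds since JVM start
HEAP = re.compile(r'(\d+)K->(\d+)K\((\d+)K\)')    # before->after(total)

def summarize(path):
    last_uptime = 0.0
    after_sizes = []
    with open(path, errors='replace') as f:
        for line in f:
            m = UPTIME.search(line)
            if m:
                last_uptime = max(last_uptime, float(m.group(1)))
            for _before, after, _total in HEAP.findall(line):
                after_sizes.append(int(after))
    print(f'{path}: ~{last_uptime / 60:.1f} minutes of runtime, '
          f'{len(after_sizes)} heap transitions seen')
    if after_sizes:
        print(f'  post-GC heap: max {max(after_sizes) // 1024} MB, '
              f'last {after_sizes[-1] // 1024} MB')

if __name__ == '__main__':
    for gc_log in sys.argv[1:]:
        summarize(gc_log)

Run it against each node's GC log. The number that matters is the post-GC
heap once the log covers hours of real query and indexing load, not minutes.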
> I want to make it clear: I don't have 4 physical machines. I have one 48-core
> server. All 4 Solr nodes are running on the same physical machine. Each node
> has 1 shard and 1 replica. I also have a ZooKeeper ensemble running on the
> same machine on 3 different ports.
Why? You get absolutely no redundancy that way. One Solr instance and
one ZK instance would be more efficient on a single server. The
increase in efficiency probably wouldn't be significant, but it WOULD be
more efficient. You really can't get a sense of how separate servers
will behave when all the software is running on a single server.
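If you want to see the lack of redundancy laid out explicitly, the
Collections API's CLUSTERSTATUS action reports which node hosts every
replica. A minimal sketch, assuming a node listening at localhost:8983
(hypothetical address) and the usual CLUSTERSTATUS response shape:

#!/usr/bin/env python3
"""List which host each replica lives on, to show redundancy (or lack of it).

Uses the standard Collections API CLUSTERSTATUS action. The base URL is
hypothetical -- substitute your own node address.
"""
import json
from urllib.request import urlopen

BASE = 'http://localhost:8983/solr'    # hypothetical node address
URL = BASE + '/admin/collections?action=CLUSTERSTATUS&wt=json'

status = json.load(urlopen(URL))
for coll, cdata in status['cluster']['collections'].items():
    for shard, sdata in cdata['shards'].items():
        hosts = {r['node_name'].split(':')[0]
                 for r in sdata['replicas'].values()}
        print(f'{coll}/{shard}: replicas on {len(hosts)} distinct host(s): '
              f'{sorted(hosts)}')

When every shard's replicas resolve to the same host, losing that one
machine takes the whole collection down, which is the point above.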
> I am curious to know what Solr is doing when the CPU usage is at or above
> 100%, because for some queries I think even just looping through all the
> documents without using any index might be faster.
I have no way to answer this question. Solr will be doing whatever you
asked it to do.
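What you can do is take thread dumps while the CPU is pegged and see which
threads are RUNNABLE. Below is a rough sketch that runs the JDK's jstack
tool against a Solr process and counts RUNNABLE threads by their topmost
stack frame; the PID argument is whatever your process listing shows for
that node:

#!/usr/bin/env python3
"""Group RUNNABLE threads from a jstack thread dump by their top stack frame.

Run as: python3 busy_threads.py <solr-pid>
jstack ships with the JDK; this script is only a sketch.
"""
import subprocess
import sys
from collections import Counter

def busy_frames(pid):
    dump = subprocess.run(['jstack', pid], capture_output=True,
                          text=True, check=True).stdout
    counts = Counter()
    state = None
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith('java.lang.Thread.State:'):
            state = line.split(':', 1)[1].strip()
        elif line.startswith('at ') and state and state.startswith('RUNNABLE'):
            counts[line[3:]] += 1    # top frame of a RUNNABLE thread
            state = None             # only count the first frame per thread
    return counts

if __name__ == '__main__':
    for frame, n in busy_frames(sys.argv[1]).most_common(15):
        print(f'{n:4d}  {frame}')

Take a few dumps several seconds apart; the frames that stay near the top
across dumps are where the CPU time is going.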
The screenshot of the top output shows that all four of the nodes there are
using about 3GB of memory each (RES minus SHR), which would be consistent
with the very short runtimes noted in the GC logs. The VIRT column reveals
that each node has about 100GB of index data, so about 400GB of index data
in total. Not much can be determined when the runtime is so small.
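To spell out that arithmetic with made-up numbers in the same ballpark as
the screenshot (the figures below are hypothetical, not read from your
actual top output):

GIB = 1024 ** 3

# Hypothetical per-node figures, roughly like the screenshot.
virt = 103 * GIB   # virtual size: mostly memory-mapped index files
res = 4 * GIB      # resident set size
shr = 1 * GIB      # shared pages (mapped index data in the OS page cache)

jvm_memory = res - shr            # roughly what the Java process itself uses
mapped_index = virt - jvm_memory  # rough size of the index files the node maps

print(f'per node: ~{jvm_memory / GIB:.0f} GiB JVM memory, '
      f'~{mapped_index / GIB:.0f} GiB of mapped index')
print(f'all four nodes: ~{4 * mapped_index / GIB:.0f} GiB of index data')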
Thanks,
Shawn