On 11/4/2018 8:38 AM, Chuming Chen wrote:
I have shared a tarball with you (apa...@elyograg.org) from Google Drive. The
tarball includes the logs directories of 4 nodes, solrconfig.xml, solr.in.sh, and
a screenshot of the top command. The log files cover about 1 day. However, I
restarted the Solr cloud several times during that period.

Runtime represented in the GC log for node1 is 23 minutes, not anywhere near a full day.

Runtime represented in the GC log for node2 is just under 16 minutes.

Runtime represented in the GC log for node3 is 434 milliseconds.

Runtime represented in the GC log for node4 is 501 milliseconds.

This is not enough to even make a guess, much less a reasoned recommendation about the heap size you will actually need.  There must be enough runtime that there have been significant garbage collections so we can get a sense about how much memory the application actually needs.
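
For reference, those runtime figures come straight from the uptime prefix that each GC log entry carries.  Below is a minimal sketch of my own (not something that ships with Solr) that pulls that figure out and counts full collections.  It assumes the Java 8 style log lines that the default GC flags in solr.in.sh write, where each event is prefixed with the JVM uptime in seconds; adjust the pattern if your JVM version or flags differ:

    #!/usr/bin/env python3
    # Rough sketch: how much runtime does a GC log actually cover, and how
    # many full collections does it contain?
    import re
    import sys

    # JVM uptime in seconds, printed just before each "[GC" / "[Full GC" event
    UPTIME = re.compile(r'(\d+\.\d+): \[')

    last_uptime = 0.0
    full_gcs = 0

    with open(sys.argv[1], errors="replace") as log:
        for line in log:
            m = UPTIME.search(line)
            if m:
                last_uptime = max(last_uptime, float(m.group(1)))
            if "Full GC" in line:
                full_gcs += 1

    print("runtime covered by this log: about %.1f minutes" % (last_uptime / 60))
    print("full collections recorded:   %d" % full_gcs)

Until numbers like these cover hours of real traffic and at least a few significant collections for each node, any heap recommendation is guesswork.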

I want to make it clear. I don’t have 4 physical machines. I have a 48-core
server. All 4 Solr nodes are running on the same physical machine. Each node
has 1 shard and 1 replica. I also have a ZooKeeper ensemble running on the
same machine on 3 different ports.

Why?  You get absolutely no redundancy that way.  One Solr instance and one ZK instance would be more efficient on a single server.  The increase in efficiency probably wouldn't be significant, but it WOULD be more efficient.  You really can't get a sense about how separate servers will behave if all the software is running on a single server.

I am curious to know what Solr is doing when the CPU usage is 100% or more than
100%. For some queries, I think even just looping through all the documents
without using any index might be faster.

I have no way to answer this question.  Solr will be doing whatever you asked it to do.

The screenshot of the top output shows that all four of the nodes there are using about 3GB of memory each (RES minus SHR), which would be consistent with the very short runtimes noted by the GC logs.  The VIRT column reveals that each node has about 100GB of index data, so about 400GB of index data total.  Not much can be determined when the runtime is so small.
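
To make the arithmetic behind that reading explicit, here is a tiny illustration.  The RES/SHR/VIRT figures in it are hypothetical placeholders in the same ballpark as the screenshot, not values copied from it:

    # Made-up top values, in GB, for one Solr node.
    res_gb, shr_gb, virt_gb = 9.0, 6.0, 103.0

    # RES includes whatever mmapped index pages the OS has cached for the
    # process, and SHR is mostly those same shared pages, so RES minus SHR is
    # roughly the memory the JVM itself has allocated.
    really_used_gb = res_gb - shr_gb

    # VIRT is dominated by the index files the node has mapped (plus the heap
    # and some JVM overhead), so it gives a rough per-node index size.
    print("memory this node is really using: about %.0f GB" % really_used_gb)
    print("index data mapped by this node:   roughly %.0f GB" % virt_gb)
    print("index data across all 4 nodes:    roughly %.0f GB" % (4 * virt_gb))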

Thanks,
Shawn
