I have three CentOS machines running Solr 4.6.0 in SolrCloud mode without any replication. That is, numShards is 3 and there is only one Solr instance running on each of the boxes.

I am also running ZooKeeper on the same boxes. This is a test environment; I would not normally run ZooKeeper alongside Solr.

As I insert data into Solr, the boxes get into a weird state. When I connect through PuTTY and enter my username and password, nothing happens: the session just sits there and never reaches a command prompt. If I stop the data import, after a while I can log in again.

I ran the following command on one of the boxes and saw this:

    ps -lf -C java

    F S UID    PID  PPID  C PRI  NI ADDR       SZ WCHAN  STIME TTY              TIME CMD
    0 S root  4772     1 99  80   0 -     1926607 futex_ 12:13 pts/0 213852-21:10:31 java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp /var/zookeeper/bin/../build/classes:/var/zookeeper/bin/../build/lib/*.jar:/var/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/var/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/var/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/var/zookeeper/bin/../lib/log4j-1.2.15.jar:/var/zookeeper/bin/../lib/jline-0.9.94.jar:/var/zookeeper/bin/../zookeeper-3.4.5.jar:/var/zookeeper/bin/../src/java/lib/*.jar:/var/zookeeper/bin/../conf: -Xms1G -Xmx4G -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /var/zookeeper/bin/../conf/zoo.cfg
    0 S root  5009     1 99  80   0 -    46184325 futex_ 12:26 pts/0 219341-04:38:50 /usr/bin/java -Dbootstrap_confdir=./solr/mycore/conf -Xms6G -Xmx12G -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=3000 -Dcollection.configName=amtcacheconf -DzkHost=prdslslbgtmdb01:2181,prdslslbgtmdb03:2181,prdslslbgtmdb04:2181 -DnumShards=3 -jar start.jar
    1 D root  7879  5009 99  80   0 -    46184325 sched_ 15:40 pts/0    208-11:14:20 /usr/bin/java -Dbootstrap_confdir=./solr/mycore/conf -Xms6G -Xmx12G -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=3000 -Dcollection.configName=amtcacheconf -DzkHost=prdslslbgtmdb01:2181,prdslslbgtmdb03:2181,prdslslbgtmdb04:2181 -DnumShards=3 -jar start.jar
    1 D root  7949  5009 99  80   0 -    46184325 sched_ 15:44 pts/0    208-11:14:20 /usr/bin/java -Dbootstrap_confdir=./solr/mycore/conf -Xms6G -Xmx12G -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=3000 -Dcollection.configName=amtcacheconf -DzkHost=prdslslbgtmdb01:2181,prdslslbgtmdb03:2181,prdslslbgtmdb04:2181 -DnumShards=3 -jar start.jar


How did I end up with two child processes of Solr running? Notice that there are two PIDs, 7879 and 7949, that are children of 5009. They show the exact same command as well, with all of the parameters I used to launch Solr.

I also notice that the "F" (flags) column is "1" for those two processes, which according to the ps man page means "forked but didn't exec".
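For reference, this is how I am reading the "F" column. It is only a minimal sketch, assuming the meanings listed in the procps ps man page (the column is the sum of 1 = forked but didn't exec and 4 = used super-user privileges). The function name decode_ps_flags is my own and not part of any tool.

```shell
#!/bin/sh
# Decode the numeric "F" (process flags) column printed by `ps -l`.
# Assumption: the flag values are those documented in the procps ps man page:
#   1 = forked but didn't exec
#   4 = used super-user privileges
# decode_ps_flags is a hypothetical helper name, not an existing command.
decode_ps_flags() {
    flags=$1
    out=""
    # Bit 1: the process was forked but has not called exec.
    if [ $((flags & 1)) -ne 0 ]; then
        out="forked but didn't exec"
    fi
    # Bit 4: the process used super-user privileges.
    if [ $((flags & 4)) -ne 0 ]; then
        out="${out:+$out, }used super-user privileges"
    fi
    echo "${out:-no flags set}"
}
```

Calling `decode_ps_flags 1` gives the reading I quoted for PIDs 7879 and 7949, while the parent (5009) has an "F" value of 0.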

Also the WCHAN is sched_ on both of them.

The "S" (state) column is "D", which means uninterruptible sleep (usually I/O).
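If it helps, the same state can be cross-checked against /proc without going through ps. The sketch below assumes a Linux /proc filesystem, where /proc/PID/status contains State, PPid and Threads lines. The function name summarize_status is hypothetical, and it reads the status text from standard input so it does not depend on any particular PID existing.

```shell
#!/bin/sh
# Print only the State, PPid and Threads fields from a /proc/<pid>/status
# listing supplied on standard input.
# summarize_status is a hypothetical helper name, not an existing command.
summarize_status() {
    # Fields in /proc/<pid>/status look like "State:<TAB>D (disk sleep)",
    # so split on the colon plus any following spaces or tabs.
    awk -F':[ \t]*' '
        $1 == "State" || $1 == "PPid" || $1 == "Threads" { print $1 "=" $2 }
    '
}

# Example usage on a live system (PID taken from the ps listing):
#   summarize_status < /proc/7879/status
```

A Threads value greater than 1 or a PPid other than 5009 would tell me whether these entries are really separate child processes as ps suggests.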

Where are these processes coming from? Do I have something configured incorrectly?
