As an additional data point, here is the tcpdump output captured while
starting Solr in the Docker container. I logged into the container and ran
"bin/solr start -f -c" (the same command my Dockerfile's CMD executes):

root@91e3883fb675:/opt/solr-8.2.0# tcpdump -nvvv -i any -c 100 host 172.20.60.138
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
21:54:49.426019 IP (tos 0x0, ttl 64, id 44803, offset 0, flags [DF], proto TCP (6), length 60)
    172.17.0.2.60562 > 172.20.60.138.2181: Flags [S], cksum 0x94e0 (incorrect -> 0x19d3), seq 2175798173, win 29200, options [mss 1460,sackOK,TS val 6792350 ecr 0,nop,wscale 7], length 0
21:54:49.472340 IP (tos 0x0, ttl 37, id 37699, offset 0, flags [none], proto TCP (6), length 48)
    172.20.60.138.2181 > 172.17.0.2.60562: Flags [S.], cksum 0xd892 (correct), seq 452884582, ack 2175798174, win 65535, options [mss 1460,wscale 2,eol], length 0
21:54:49.472428 IP (tos 0x0, ttl 64, id 44804, offset 0, flags [DF], proto TCP (6), length 40)
    172.17.0.2.60562 > 172.20.60.138.2181: Flags [.], cksum 0x94cc (incorrect -> 0x0472), seq 1, ack 1, win 229, length 0
21:54:49.472950 IP (tos 0x0, ttl 64, id 44805, offset 0, flags [DF], proto TCP (6), length 89)
    172.17.0.2.60562 > 172.20.60.138.2181: Flags [P.], cksum 0x94fd (incorrect -> 0x8ecb), seq 1:50, ack 1, win 229, length 49
21:54:49.473400 IP (tos 0x0, ttl 37, id 33425, offset 0, flags [none], proto TCP (6), length 40)
    172.20.60.138.2181 > 172.17.0.2.60562: Flags [.], cksum 0x0526 (correct), seq 1, ack 50, win 65535, length 0
21:54:59.448636 IP (tos 0x0, ttl 64, id 44806, offset 0, flags [DF], proto TCP (6), length 40)
    172.17.0.2.60562 > 172.20.60.138.2181: Flags [F.], cksum 0x94cc (incorrect -> 0x0440), seq 50, ack 1, win 229, length 0
21:54:59.449070 IP (tos 0x0, ttl 37, id 3430, offset 0, flags [none], proto TCP (6), length 40)
    172.20.60.138.2181 > 172.17.0.2.60562: Flags [.], cksum 0x0525 (correct), seq 1, ack 51, win 65535, length 0
21:55:21.518447 IP (tos 0x0, ttl 37, id 2259, offset 0, flags [none], proto TCP (6), length 40)
    172.20.60.138.2181 > 172.17.0.2.60562: Flags [F.], cksum 0x0524 (correct), seq 1, ack 51, win 65535, length 0
21:55:21.518513 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    172.17.0.2.60562 > 172.20.60.138.2181: Flags [.], cksum 0x043f (correct), seq 51, ack 2, win 229, length 0

172.17.0.2 is my Solr Docker container; 172.20.60.138 is my zk1 Docker
container, which runs in AWS.

From this, it looks like communication is happening, but the connection is
being finished and closed instead of held open. Am I interpreting this
correctly?
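
One way to check that reading is to tally, per endpoint, how many payload
bytes were sent and which side sent the first FIN. The following is a minimal
Python sketch rather than a definitive tool: the function names are
hypothetical, and it assumes the `tcpdump -nvvv` output is unwrapped so that
each packet's header line and detail line each sit on a single line.

```python
import re

# Header line, e.g. "21:54:49.472950 IP (tos 0x0, ttl 64, ...)"
TIME_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d+) IP ")
# Detail line, e.g. "    172.17.0.2.60562 > 172.20.60.138.2181: Flags [P.], ... length 49"
PACKET_RE = re.compile(
    r"^\s*(\d+(?:\.\d+){4}) > (\d+(?:\.\d+){4}): "
    r"Flags \[([^\]]+)\].*\blength (\d+)\s*$"
)


def parse_packets(capture):
    """Turn unwrapped tcpdump text into a list of packet dicts."""
    packets = []
    timestamp = None
    for line in capture.splitlines():
        header = TIME_RE.match(line)
        if header:
            timestamp = header.group(1)
            continue
        detail = PACKET_RE.match(line)
        if detail and timestamp is not None:
            src, dst, flags, length = detail.groups()
            packets.append({
                "time": timestamp,
                "src": src,
                "dst": dst,
                "flags": flags,
                "length": int(length),  # TCP payload bytes
            })
    return packets


def summarize(packets):
    """Report payload bytes per sender and the first packet carrying FIN."""
    payload = {}
    first_fin = None
    for packet in packets:
        payload[packet["src"]] = payload.get(packet["src"], 0) + packet["length"]
        if first_fin is None and "F" in packet["flags"]:
            first_fin = packet
    return {"payload_bytes": payload, "first_fin": first_fin}
```

If the summary shows the Solr side sending the only payload and also sending
the first FIN roughly ten seconds later, that would be consistent with Solr
giving up on a request that ZK acknowledged at the TCP level but never
answered, though the capture alone does not prove why.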


--
Drew(i...@gmail.com)
http://wyntermute.dyndns.org/blog/

-- I Drive Way Too Fast To Worry About Cholesterol.


On Fri, Oct 18, 2019 at 1:18 PM Drew Kidder <dre...@gmail.com> wrote:

> Again, thank you all for the suggestions.
>
> My ZK ensemble is talking to each other and the outside world:
>
> solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk1.zookeeper.internal 2181
> Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built
> on 05/03/2019 12:07 GMT
> Latency min/avg/max: 0/0/0
> Received: 53
> Sent: 33
> Connections: 1
> Outstanding: 19
> Zxid: 0x0
> Mode: follower
> Node count: 5
>
> solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk2.zookeeper.internal 2181
> Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built
> on 05/03/2019 12:07 GMT
> Latency min/avg/max: 0/0/0
> Received: 37
> Sent: 17
> Connections: 1
> Outstanding: 19
> Zxid: 0x200000000
> Mode: leader
> Node count: 5
> Proposal sizes last/min/max: 32/32/36
>
> solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk3.zookeeper.internal 2181
> Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built
> on 05/03/2019 12:07 GMT
> Latency min/avg/max: 0/0/0
> Received: 7
> Sent: 3
> Connections: 1
> Outstanding: 3
> Zxid: 0x200000000
> Mode: follower
> Node count: 5
>
> All of these commands can be executed on the solr container as either the
> root user or the solr user (see the command prompt in each command). Note
> that zk2 is the leader and zk1 and zk3 are followers. The configuration
> files (including the ZOO_MY_ID and ZOO_SERVERS environment variables) are
> all set up correctly and, for all intents and purposes, ZK appears to be
> set up correctly and functioning.
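
The same check can be scripted. The following is a minimal Python sketch, not
part of the original setup: the function names are hypothetical, and it
assumes the `srvr` four-letter command is enabled on each server (as the `nc`
output above suggests) and that the server closes the connection after
replying.

```python
import socket


def four_letter_word(host, command="srvr", port=2181, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_srvr(text):
    """Parse 'Key: value' lines from srvr output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def ensemble_modes(hosts):
    """Map each host to its reported Mode (leader, follower, ...)."""
    return {host: parse_srvr(four_letter_word(host)).get("Mode") for host in hosts}


def count_leaders(modes):
    """A healthy ensemble should report exactly one leader."""
    return sum(1 for mode in modes.values() if mode == "leader")
```

For example, `ensemble_modes(["zk1.zookeeper.internal", "zk2.zookeeper.internal", "zk3.zookeeper.internal"])`
would be expected to return one `leader` and two `follower` entries for the
ensemble shown above.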
>
> Jorne Franke: I tried implementing your suggestion of providing "/" as the
> root node by appending "/" to the end of the ZK_HOST connection string and
> it still did not work (e.g. ENV ZK_HOST
> zk1.zookeeper.internal:2181,zk2.zookeeper.internal:2181,zk3.zookeeper.internal:2181/
> in the Dockerfile). Was this what you meant?  Or were you suggesting to set
> the ZK_ROOT in the Solr configs/environment instead?
>
>
>
> On Fri, Oct 18, 2019 at 12:11 PM Ahmed Adel <aa.0...@gmail.com> wrote:
>
>> This could be because the ZooKeeper ensemble is not properly configured.
>> With a very similar setup, a ZK cluster of three hosts and one SolrCloud
>> node (all containers), I got the system running. Each ZK host has the
>> ZOO_MY_ID and ZOO_SERVERS environment variables set before running ZK.
>> In this case, the former would be 1, 2, or 3 depending on the host,
>> and the latter would be "server.1=z1:2888:3888;2181
>> server.2=z2:2888:3888;2181 server.3=z3:2888:3888;2181", the same on all
>> hosts (the double quotes may be needed for proper parsing). This
>> ZOO_SERVERS syntax is for ZK version 3.5; the 3.4 syntax is slightly different.
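
To make the 3.5-style value above concrete, here is a minimal Python sketch
that builds it from a list of hostnames. The function name and the default
ports are assumptions taken from the example string (2888 for peer traffic,
3888 for leader election, 2181 for clients), not from any official tooling.

```python
def zoo_servers(hosts, peer_port=2888, election_port=3888, client_port=2181):
    """Build a ZK 3.5-style ZOO_SERVERS value; server ids start at 1."""
    return " ".join(
        f"server.{server_id}={host}:{peer_port}:{election_port};{client_port}"
        for server_id, host in enumerate(hosts, start=1)
    )
```

Each host would then get the same ZOO_SERVERS string and its own ZOO_MY_ID
matching its position in the list.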
>>
>> http://aadel.io
>>
>> On Fri, Oct 18, 2019 at 5:28 PM Drew Kidder <dre...@gmail.com> wrote:
>>
>> > Thank you all for your suggestions! I appreciate the fast turnaround.
>> >
>> > My setup is using Amazon ECS for our solr cloud installation. Each ZK is in
>> > its own container, using Route53 Service Discovery to provide the DNS name.
>> > The ZK nodes can all talk to each other, and I can communicate to each one
>> > of those nodes from my local machine and from within the solr container.
>> > Solr is one node per container, as Martijn correctly assumed. I am not
>> > using a zkRoot at present because my intention is to use ZK solely for Solr
>> > Cloud and nothing else.
>> >
>> > I have tried removing the "-z" option from the Dockerfile CMD and using the
>> > ZK_HOST environment variable (see below). I have also modified
>> > solr.in.sh and set the ZK_HOST variable there, all to no avail. I have
>> > tried the Dockerfile command route, and I have also logged into the solr
>> > container and run the CMD manually to see if there was a problem
>> > with the way I was using the CMD entry. All of those methods give me the
>> > same output, captured in the gist below.
>> >
>> > The gist for my solr.log output is here:
>> > https://gist.github.com/dkidder/2db9a6d393dedb97a39ed32e2be0c087
>> >
>> > My Dockerfile for the solr container looks like this:
>> >
>> >
>> > FROM    solr:8.2
>> >
>> > EXPOSE    8983 8999 2181
>> >
>> > VOLUME    /app/logs
>> > VOLUME    /app/data
>> > VOLUME    /app/conf
>> >
>> > ## add our jetty configuration (increased request size!)
>> > COPY   jetty.xml /opt/solr/server/etc/
>> >
>> > ## SolrCloud configuration
>> > ENV     ZK_HOST zk1:2181,zk2:2181,zk3:2181
>> > ENV     ZK_CLIENT_TIMEOUT 30000
>> >
>> > USER   root
>> > RUN    apt-get update
>> > RUN    apt-get install -y netcat net-tools vim procps
>> > USER   solr
>> >
>> > # Copy over custom solr plugins
>> > COPY    myplugins/src/resources/* /opt/solr/server/solr/my-resources/
>> > COPY    lib/*.jar /opt/solr/my-lib/
>> >
>> > # Copy over my configs
>> > COPY    conf/ /app/conf
>> >
>> > #Start solr in cloud mode, connecting to zookeeper
>> > CMD       ["solr","start","-f","-c"]
>> >
>> > The docker command I use to run the image built from this Dockerfile is
>> > `docker run -p 8983:8983 -p 2181:2181 --name $(APP_NAME) $(APP_NAME):latest`
>> >
>> > Output of `ps -eflww` from within the solr container (as root):
>> >
>> > root@fe0ad5b40b42:/opt/solr-8.2.0# ps -eflww
>> > F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
>> > 4 S solr         1     0  9  80   0 - 1043842 -    14:36 ?        00:00:07 /usr/local/openjdk-11/bin/java -server -Xms512m -Xmx512m -XX:+UseG1GC -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages -XX:+AlwaysPreTouch -Xlog:gc*:file=/var/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=18983 -Dcom.sun.management.jmxremote.rmi.port=18983 -DzkClientTimeout=30000 -DzkHost=zk1:2181,zk2:2181,zk3:2181 -Dsolr.log.dir=/var/solr/logs -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC -Djetty.home=/opt/solr/server -Dsolr.solr.home=/var/solr/data -Dsolr.data.home= -Dsolr.install.dir=/opt/solr -Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf -Dlog4j.configurationFile=file:/var/solr/log4j2.xml -Xss256k -Dsolr.jetty.https.port=8983 -jar start.jar --module=http
>> > 4 S root        90     0  0  80   0 -  4988 -      14:37 pts/0    00:00:00 /bin/bash
>> > 0 R root        95    90  0  80   0 -  9595 -      14:37 pts/0    00:00:00 ps -eflww
>> >
>> > Output of netstat from within the solr container (as root):
>> >
>> > root@fe0ad5b40b42:/opt/solr-8.2.0# netstat
>> > Active Internet connections (w/o servers)
>> > Proto Recv-Q Send-Q Local Address           Foreign Address         State
>> > tcp        0      0 fe0ad5b40b42:43678      172.20.28.179:2181      TIME_WAIT
>> > tcp        0      0 fe0ad5b40b42:60164      172.20.155.241:2181     TIME_WAIT
>> > tcp        0      0 fe0ad5b40b42:60500      172.20.60.138:2181      TIME_WAIT
>> > Active UNIX domain sockets (w/o servers)
>> > Proto RefCnt Flags       Type       State         I-Node   Path
>> > unix  2      [ ]         STREAM     CONNECTED     129252
>> > unix  2      [ ]         STREAM     CONNECTED     129270
>> >
>> > I'm beginning to think that ZK is not set up correctly. I haven't uploaded
>> > any configuration files to ZK yet; my understanding was that I could start
>> > up a solr cloud node with no collections and upload the configuration from
>> > there. I was under the impression that it would try to connect to ZK and, if
>> > it couldn't get config files from there, it would use local config files. Do
>> > I need to upload the solr cloud configuration files to ZK before starting
>> > up the cluster?  The netstat output makes it look like the solr container
>> > is indeed connected to the ZK containers, but as far as I can see there's
>> > no indication of why it cannot connect to Zookeeper.
>> >
>> >
>> >
>> > On Fri, Oct 18, 2019 at 3:11 AM Martijn Koster <mak-luc...@greenhills.co.uk> wrote:
>> >
>> > >
>> > >
>> > > > On 18 Oct 2019, at 00:25, Drew Kidder <dre...@gmail.com> wrote:
>> > >
>> > > > * I'm using the following command line to start a basic solr cloud instance
>> > > > as per the documentation: `bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181`
>> > >
>> > > I assume you’re just looking to run a single Solr node in a single
>> > > container, right?
>> > >
>> > > Just set the ZK_HOST environment variable, and remove the command-line
>> > > arguments.
>> > > And you don’t need to specify the port number unless you deviate from the
>> > > default.
>> > > Have a look at this example:
>> > > https://github.com/docker-solr/docker-solr-examples/blob/master/swarm/docker-compose.yml#L61
>> > >
>> > > The “start” command starts Solr in the background, which is typically not
>> > > what you want when running Solr under docker.
>> > >
>> > >
>> > > Why your command isn’t working as is, is not clear. When you say you’re
>> > > using that command-line, how do you actually do that? In a full docker
>> > > command line, or a compose file, or from a “docker exec”, or from some
>> > > orchestrator? Share the exact thing you’re doing; perhaps there is a
>> > > mistake there.
>> > > Also, run `ps -eflww` in the container to see what command-line arguments
>> > > the JVM actually got started with.
>> > > And share the full startup log somewhere (in a GitHub gist perhaps); there
>> > > might be something of interest earlier on.
>> > >
>> > > >> (running `echo ruok | nc zk1 2181` returns the expected "imok" response
>> > > >> from ZK within the docker container where Solr is located)
>> > > >> * The netcat command mentioned above shows up in the ZK logs, but the Solr
>> > > >> attempts to connect do not (it's like the request isn't even getting to ZK)
>> > >
>> > > Then it doesn’t sound like an environmental firewall/security-group/routing
>> > > issue.
>> > > Next step to debug then could be to check if you actually see Solr make
>> > > tcp connections to port 2181, in the Solr container, using
>> > > tcpdump/sysdig/netstat or some such.
>> > > If that gives a negative result, then you know it’s an issue in your Solr
>> > > invocation config, or name resolution.
>> > > If that gives a positive result, then it’s environmental after all; and
>> > > you can dig further.
>> > >
>> > >
>> > > But try the ZK_HOST thing first; it may just fix it.
>> > >
>> > > — Martijn
>> >
>>
>
