[ 
https://issues.apache.org/jira/browse/GEODE-8730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243444#comment-17243444
 ] 

Bill Burcham edited comment on GEODE-8730 at 12/3/20, 7:53 PM:
---------------------------------------------------------------

From the IDE I ran the docker Gradle task in geode-assembly to create a fresh 
Geode Docker image. 

Then from 
/Users/bburcham/Projects/geode/geode-assembly/src/acceptanceTest/resources/org/apache/geode/client/sni
I ran "docker-compose up", and from the Docker app dashboard I opened a shell 
into the running ("geode") container.

Once in, I ran "apt-get update" and "apt-get install net-tools", then checked 
the listeners before starting Geode:

{noformat}
# netstat -lp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:37221        0.0.0.0:*               LISTEN      -
udp        0      0 127.0.0.11:42103        0.0.0.0:*                           -
Active UNIX domain sockets (only servers)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path
{noformat}


I then ran the gfsh startup script with "gfsh run 
--file=/geode/scripts/geode-starter-2.gfsh". For reference, that file contains:


{noformat}
start locator --name=locator-maeve --connect=false --redirect-output --hostname-for-clients=locator-maeve --properties-file=/geode/config/gemfire.properties --security-properties-file=/geode/config/gfsecurity.properties --J=-Dgemfire.ssl-keystore=/geode/config/locator-maeve-keystore.jks
start server --name=server-dolores --group=group-dolores --hostname-for-clients=server-dolores --locators=geode[10334] --properties-file=/geode/config/gemfire.properties --security-properties-file=/geode/config/gfsecurity.properties --J=-Dgemfire.ssl-keystore=/geode/config/server-dolores-keystore.jks
start server --name=server-clementine --group=group-clementine --hostname-for-clients=server-clementine --server-port=40405 --locators=geode[10334] --properties-file=/geode/config/gemfire.properties --security-properties-file=/geode/config/gfsecurity.properties --J=-Dgemfire.ssl-keystore=/geode/config/server-clementine-keystore.jks
connect --locator=geode[10334] --use-ssl=true --security-properties-file=/geode/config/gfsecurity.properties
create region --name=region-dolores --group=group-dolores --type=REPLICATE
create region --name=region-clementine --group=group-clementine --type=REPLICATE
{noformat}

With the locator and both servers running, netstat showed:

{noformat}
# netstat -lp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 geode:46867             0.0.0.0:*               LISTEN      256/java
tcp        0      0 geode:43540             0.0.0.0:*               LISTEN      515/java
tcp        0      0 0.0.0.0:40404           0.0.0.0:*               LISTEN      419/java
tcp        0      0 0.0.0.0:40405           0.0.0.0:*               LISTEN      515/java
tcp        0      0 0.0.0.0:40053           0.0.0.0:*               LISTEN      515/java
tcp        0      0 0.0.0.0:46649           0.0.0.0:*               LISTEN      256/java
tcp        0      0 geode:57053             0.0.0.0:*               LISTEN      256/java
tcp        0      0 geode:55518             0.0.0.0:*               LISTEN      515/java
tcp        0      0 geode:55486             0.0.0.0:*               LISTEN      419/java
tcp        0      0 0.0.0.0:7070            0.0.0.0:*               LISTEN      256/java
tcp        0      0 0.0.0.0:10334           0.0.0.0:*               LISTEN      256/java
tcp        0      0 0.0.0.0:33953           0.0.0.0:*               LISTEN      419/java
tcp        0      0 127.0.0.11:37221        0.0.0.0:*               LISTEN      -
tcp        0      0 geode:48715             0.0.0.0:*               LISTEN      419/java
tcp        0      0 0.0.0.0:1099            0.0.0.0:*               LISTEN      256/java
udp        0      0 geode:41000             0.0.0.0:*                           256/java
udp        0      0 geode:41001             0.0.0.0:*                           419/java
udp        0      0 geode:41002             0.0.0.0:*                           515/java
udp        0      0 127.0.0.11:42103        0.0.0.0:*                           -
Active UNIX domain sockets (only servers)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path
unix  2      [ ACC ]     STREAM     LISTENING     155159   256/java             /tmp/.java_pid256.tmp
unix  2      [ ACC ]     STREAM     LISTENING     158787   419/java             /tmp/.java_pid419.tmp
unix  2      [ ACC ]     STREAM     LISTENING     159071   515/java             /tmp/.java_pid515.tmp
{noformat}

Grouping these by PID: locator first, then cache servers:


{noformat}
tcp        0      0 0.0.0.0:10334           0.0.0.0:*               LISTEN      256/java  for locator clients
tcp        0      0 0.0.0.0:1099            0.0.0.0:*               LISTEN      256/java  for gfsh
tcp        0      0 0.0.0.0:7070            0.0.0.0:*               LISTEN      256/java  for browser (pulse)
tcp        0      0 0.0.0.0:46649           0.0.0.0:*               LISTEN      256/java
tcp        0      0 geode:46867             0.0.0.0:*               LISTEN      256/java
tcp        0      0 geode:57053             0.0.0.0:*               LISTEN      256/java
udp        0      0 geode:41000             0.0.0.0:*                           256/java  for membership

tcp        0      0 0.0.0.0:40404           0.0.0.0:*               LISTEN      419/java  for client's cache
tcp        0      0 0.0.0.0:33953 ***       0.0.0.0:*               LISTEN      419/java
tcp        0      0 geode:55486             0.0.0.0:*               LISTEN      419/java  for health
tcp        0      0 geode:48715             0.0.0.0:*               LISTEN      419/java  for peer's cache
udp        0      0 geode:41001             0.0.0.0:*                           419/java  for membership

tcp        0      0 0.0.0.0:40405           0.0.0.0:*               LISTEN      515/java  for client's cache
tcp        0      0 0.0.0.0:40053 ***       0.0.0.0:*               LISTEN      515/java
tcp        0      0 geode:43540             0.0.0.0:*               LISTEN      515/java  for peer's cache
tcp        0      0 geode:55518             0.0.0.0:*               LISTEN      515/java  for health
udp        0      0 geode:41002             0.0.0.0:*                           515/java  for membership
{noformat}
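
The grouping above was done by hand. As a rough sketch of how it could be scripted, the Python below parses "netstat -lp" listener lines and buckets them by owning PID. The sample lines are copied from the output above; the regular expression and the function name group_by_pid are illustrative assumptions, not part of Geode or net-tools, and the pattern only handles the tcp/udp rows shown here.

```python
import re
from collections import defaultdict

# A few listener lines copied from the netstat output above, one socket per line.
SAMPLE = """\
tcp        0      0 0.0.0.0:10334           0.0.0.0:*               LISTEN      256/java
tcp        0      0 0.0.0.0:40404           0.0.0.0:*               LISTEN      419/java
tcp        0      0 0.0.0.0:33953           0.0.0.0:*               LISTEN      419/java
tcp        0      0 0.0.0.0:40405           0.0.0.0:*               LISTEN      515/java
udp        0      0 geode:41002             0.0.0.0:*                           515/java
tcp        0      0 127.0.0.11:37221        0.0.0.0:*               LISTEN      -
"""

# proto, two queue counters, local host:port, foreign address,
# optional LISTEN state (absent for udp), then "PID/program" or "-".
LINE = re.compile(
    r"^(?P<proto>tcp|udp)\s+\d+\s+\d+\s+"
    r"(?P<host>\S+):(?P<port>\d+)\s+\S+\s+"
    r"(?:LISTEN\s+)?(?P<owner>\S+)\s*$"
)


def group_by_pid(text):
    """Return {pid: [(proto, host, port), ...]} for every tcp/udp row in text."""
    groups = defaultdict(list)
    for line in text.splitlines():
        m = LINE.match(line)
        if m is None:
            continue  # header rows, unix sockets, blank lines
        # netstat prints "-" when it cannot show the owning process.
        pid = m.group("owner").split("/", 1)[0]
        groups[pid].append((m.group("proto"), m.group("host"), int(m.group("port"))))
    return dict(groups)


if __name__ == "__main__":
    for pid, sockets in group_by_pid(SAMPLE).items():
        print(pid, sorted(sockets, key=lambda s: s[2]))
```

Feeding it the full listing instead of SAMPLE should reproduce the three per-process groups, plus a "-" group for the two 127.0.0.11 sockets that have no visible owner.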

I've highlighted with "***" the two bindings that are odd. These are ephemeral 
ports, but they fall outside the default (configured) port range 41000-61000. I 
expect they differ on each run and are the cause of this bug.
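
To make that check repeatable, here is a small Python sketch that flags TCP listeners that are neither a labeled fixed port nor inside the 41000-61000 range. The port lists are transcribed from the grouped netstat output above; the constant and function names are made up for illustration.

```python
# Range quoted in this comment as the default (configured) port range.
CONFIGURED_PORT_RANGE = (41000, 61000)

# Ports that carry a label in the grouped listing: locator, gfsh, pulse,
# and the two client-facing cache server ports.
KNOWN_FIXED_PORTS = {10334, 1099, 7070, 40404, 40405}


def unexplained_ports(ports, port_range=CONFIGURED_PORT_RANGE, known=KNOWN_FIXED_PORTS):
    """Return the ports that are neither known fixed ports nor inside port_range."""
    low, high = port_range
    return sorted(p for p in ports if p not in known and not (low <= p <= high))


# TCP listeners per PID, transcribed from the grouped netstat output.
tcp_listeners = {
    "256": [10334, 1099, 7070, 46649, 46867, 57053],
    "419": [40404, 33953, 55486, 48715],
    "515": [40405, 40053, 43540, 55518],
}

if __name__ == "__main__":
    for pid, ports in tcp_listeners.items():
        print(pid, unexplained_ports(ports))
```

With the data above, this flags only 33953 (PID 419) and 40053 (PID 515), the same two bindings marked with "***".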

I searched the logs for those ports and didn't find them, so what are those 
bindings? A cache server binds these TCP ports:

* client's cache (40404, 40405 above)
* peer's cache, ostensibly in the port range (41000-61000)
* health monitoring, also ostensibly in the port range (41000-61000)

Of the three other TCP port bindings per cache server in the netstat output 
above, we only have categories for two (peer's cache, health monitoring). What's 
that third category?

In summary, we have these two unexplained bindings (one per cache server), and 
we have the one unexplained TCP binding before Geode even starts (see the first 
netstat above).

A jstack (stack dump) showed that RMI is the culprit for those unexplained 
cache server ports. "jstack 419 | less" showed:


{noformat}
"RMI TCP Accept-0" #26 daemon prio=9 os_prio=0 cpu=3.69ms elapsed=5311.36s tid=0x00007fbe28005800 nid=0x1c5 runnable  [0x00007fbe38862000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(java.base@11.0.9.1/Native Method)
        at java.net.AbstractPlainSocketImpl.accept(java.base@11.0.9.1/AbstractPlainSocketImpl.java:458)
        at java.net.ServerSocket.implAccept(java.base@11.0.9.1/ServerSocket.java:565)
        at java.net.ServerSocket.accept(java.base@11.0.9.1/ServerSocket.java:533)
        at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(jdk.management.agent@11.0.9.1/LocalRMIServerSocketFactory.java:52)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(java.rmi@11.0.9.1/TCPTransport.java:394)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(java.rmi@11.0.9.1/TCPTransport.java:366)
        at java.lang.Thread.run(java.base@11.0.9.1/Thread.java:834)
{noformat}
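
The LocalRMIServerSocketFactory frame suggests this listener belongs to the JVM's local management (JMX) agent rather than to Geode's own port allocation. If that listener is opened on port 0, the operating system picks the port, which would explain why it ignores the 41000-61000 range. The Python sketch below illustrates that mechanism only; it is not Geode or JDK code, and the function names are hypothetical. It binds a listener on port 0 and, on Linux, reads the kernel's ephemeral range from /proc (commonly 32768-60999 by default, a range that would include 33953, 40053, and also 40405).

```python
import socket
from pathlib import Path


def bind_ephemeral():
    """Bind a TCP listener on port 0 and return (socket, port chosen by the OS)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("0.0.0.0", 0))  # port 0 asks the OS to pick any free port
    sock.listen(1)
    return sock, sock.getsockname()[1]


def kernel_ephemeral_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return the Linux ephemeral port range as (low, high), or None if unavailable."""
    p = Path(path)
    if not p.exists():
        return None  # not Linux, or /proc is not mounted
    low, high = p.read_text().split()
    return int(low), int(high)


if __name__ == "__main__":
    sock, port = bind_ephemeral()
    try:
        print("OS-assigned port:", port)
        print("kernel ephemeral range:", kernel_ephemeral_range())
    finally:
        sock.close()
```

If the kernel range inside the container does overlap 40404-40405, an OS-assigned listener could occasionally land on a port that a later "start server" expects to be free, which would be consistent with the intermittent failure reported in this issue.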


I'll see if there is a way to lock that down. I'll also check whether port 
37221, which is bound before Geode starts, changes the next time I spin up a 
container.


> CI failure: DualServerSNIAcceptanceTest fails to start server because port is 
> in use
> ------------------------------------------------------------------------------------
>
>                 Key: GEODE-8730
>                 URL: https://issues.apache.org/jira/browse/GEODE-8730
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Darrel Schneider
>            Assignee: Bill Burcham
>            Priority: Major
>
> The run is here: 
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/587]
> {noformat}
> org.apache.geode.client.sni.DualServerSNIAcceptanceTest > classMethod FAILED
>     com.palantir.docker.compose.execution.DockerExecutionException: 
> 'docker-compose exec -T geode gfsh run 
> --file=/geode/scripts/geode-starter-2.gfsh' returned exit code 1
>     The output was:
>     1. Executing - start locator --name=locator-maeve --connect=false 
> --redirect-output --hostname-for-clients=locator-maeve 
> --properties-file=/geode/config/gemfire.properties 
> --security-properties-file=******** 
> --J=-Dgemfire.ssl-keystore=/geode/config/locator-maeve-keystore.jks
>     ...........................
>     Locator in /locator-maeve on geode[10334] as locator-maeve is currently 
> online.
>     Process ID: 47
>     Uptime: 16 seconds
>     Geode Version: 1.14.0-build.0
>     Java Version: 11.0.9.1
>     Log File: /locator-maeve/locator-maeve.log
>     JVM Arguments: -DgemfirePropertyFile=/geode/config/gemfire.properties 
> -DgemfireSecurityPropertyFile=/geode/config/gfsecurity.properties 
> -Dgemfire.enable-cluster-configuration=true 
> -Dgemfire.load-cluster-configuration-from-dir=false 
> -Dgemfire.ssl-keystore=/geode/config/locator-maeve-keystore.jks 
> -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true 
> -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 
> -Dgemfire.OSProcess.DISABLE_REDIRECTION_CONFIGURATION=true
>     Class-Path: 
> /geode/lib/geode-core-1.14.0-build.0.jar:/geode/lib/geode-dependencies.jar
>     2. Executing - start server --name=server-dolores --group=group-dolores 
> --hostname-for-clients=server-dolores --locators=geode[10334] 
> --properties-file=/geode/config/gemfire.properties 
> --security-properties-file=******** 
> --J=-Dgemfire.ssl-keystore=/geode/config/server-dolores-keystore.jks
>     .......
>     Server in /server-dolores on geode[40404] as server-dolores is currently 
> online.
>     Process ID: 199
>     Uptime: 5 seconds
>     Geode Version: 1.14.0-build.0
>     Java Version: 11.0.9.1
>     Log File: /server-dolores/server-dolores.log
>     JVM Arguments: -DgemfirePropertyFile=/geode/config/gemfire.properties 
> -DgemfireSecurityPropertyFile=/geode/config/gfsecurity.properties 
> -Dgemfire.start-dev-rest-api=false -Dgemfire.locators=geode[10334] 
> -Dgemfire.use-cluster-configuration=true -Dgemfire.groups=group-dolores 
> -Dgemfire.ssl-keystore=/geode/config/server-dolores-keystore.jks 
> -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true 
> -Dsun.rmi.dgc.server.gcInterval=9223372036854775806
>     Class-Path: 
> /geode/lib/geode-core-1.14.0-build.0.jar:/geode/lib/geode-dependencies.jar
>     3. Executing - start server --name=server-clementine 
> --group=group-clementine --hostname-for-clients=server-clementine 
> --server-port=40405 --locators=geode[10334] 
> --properties-file=/geode/config/gemfire.properties 
> --security-properties-file=******** 
> --J=-Dgemfire.ssl-keystore=/geode/config/server-clementine-keystore.jks
>     ......The Cache Server process terminated unexpectedly with exit status 
> 1. Please refer to the log file in /server-clementine for full details.
>     Exception in thread "main" java.lang.RuntimeException: An IO error 
> occurred while starting a Server in /server-clementine on geode[40405]: 
> Network is unreachable; port (40405) is not available on localhost.
>       at 
> org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:852)
>       at 
> org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:737)
>       at 
> org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:256)
>     Caused by: java.net.BindException: Network is unreachable; port (40405) 
> is not available on localhost.
>       at 
> org.apache.geode.distributed.AbstractLauncher.assertPortAvailable(AbstractLauncher.java:142)
>       at 
> org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:794)
>       ... 2 more
>     ************************* Execution Summary ***********************
>     Script file: /geode/scripts/geode-starter-2.gfsh
>     Command-1 : start locator --name=locator-maeve --connect=false 
> --redirect-output --hostname-for-clients=locator-maeve 
> --properties-file=/geode/config/gemfire.properties 
> --security-properties-file=/geode/config/gfsecurity.properties 
> --J=-Dgemfire.ssl-keystore=/geode/config/locator-maeve-keystore.jks
>     Status    : PASSED
>     Command-2 : start server --name=server-dolores --group=group-dolores 
> --hostname-for-clients=server-dolores --locators=geode[10334] 
> --properties-file=/geode/config/gemfire.properties 
> --security-properties-file=/geode/config/gfsecurity.properties 
> --J=-Dgemfire.ssl-keystore=/geode/config/server-dolores-keystore.jks
>     Status    : PASSED
>     Command-3 : start server --name=server-clementine 
> --group=group-clementine --hostname-for-clients=server-clementine 
> --server-port=40405 --locators=geode[10334] 
> --properties-file=/geode/config/gemfire.properties 
> --security-properties-file=/geode/config/gfsecurity.properties 
> --J=-Dgemfire.ssl-keystore=/geode/config/server-clementine-keystore.jks
>     Status    : FAILED
>         at 
> com.palantir.docker.compose.execution.Command.lambda$throwingOnError$12(Command.java:60)
>         at 
> com.palantir.docker.compose.execution.Command.execute(Command.java:50)
>         at 
> com.palantir.docker.compose.execution.DefaultDockerCompose.exec(DefaultDockerCompose.java:122)
>         at 
> com.palantir.docker.compose.execution.DelegatingDockerCompose.exec(DelegatingDockerCompose.java:86)
>         at 
> com.palantir.docker.compose.execution.RetryingDockerCompose.exec(RetryingDockerCompose.java:22)
>         at 
> com.palantir.docker.compose.DockerComposeRule.exec(DockerComposeRule.java:171)
>         at 
> org.apache.geode.client.sni.DualServerSNIAcceptanceTest.beforeClass(DualServerSNIAcceptanceTest.java:77)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
