[ https://issues.apache.org/jira/browse/GEODE-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422528#comment-17422528 ]
Mario Ivanac edited comment on GEODE-9642 at 9/30/21, 5:14 AM:
---------------------------------------------------------------

Steps to reproduce the fault in a smaller system:

start locator --name=locator-ln --port=10332 --locators=localhost[10332] --mcast-port=0 --J=-Dgemfire.remote-locators=localhost[10331] --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.jmx-manager-start=true --J=-Dgemfire.jmx-manager-http-port=8082 --J=-Dgemfire.jmx-manager-port=1092

configure pdx --read-serialized=true --disk-store=data

start server --name=server11 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40011 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

start server --name=server12 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40012 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

start server --name=server13 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40013 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

start server --name=server14 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40014 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

create disk-store --name=data --max-oplog-size=10 --dir=.

create region --name=/testregion --type=PARTITION_REDUNDANT_PERSISTENT --disk-store=data --total-num-buckets=13

query --query="select key,value from /testregion.entries"

create gateway-sender --id=ln --remote-distributed-system-id=2 --enable-persistence=true --disk-store-name=data --parallel=true

##after all is up, execute command

alter region --name=/testregion --gateway-sender-id=ln

As a result, the command hangs for a few minutes. This is a fault.

was (Author: mivanac):
Steps to reproduce the fault in a smaller system:

start locator --name=locator-ln --port=10332 --locators=localhost[10332] --mcast-port=0 --J=-Dgemfire.remote-locators=localhost[10331] --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.jmx-manager-start=true --J=-Dgemfire.jmx-manager-http-port=8082 --J=-Dgemfire.jmx-manager-port=1092

configure pdx --read-serialized=true --disk-store=data

start server --name=server11 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40011 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

start server --name=server12 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40012 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

start server --name=server13 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40013 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

start server --name=server14 --locators=localhost[10332] --mcast-port=0 --J=-XX:+UseG1GC --J=-Xms500m --J=-Xmx500m --server-port=40014 --J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true --J=-Dgemfire.disk.recoverValuesSync=true --off-heap-memory-size=512m --J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10 --J=-Dgemfire.EXPIRY_THREADS=1 --J=-Dgemfire.distributed-system-id=1 --J=-Dgemfire.conserve-sockets=false

create disk-store --name=data --max-oplog-size=10 --dir=.

create region --name=/testregion --type=PARTITION_REDUNDANT_PERSISTENT --disk-store=data --total-num-buckets=13

query --query="select key,value from /testregion.entries"

create gateway-sender --id=ln --remote-distributed-system-id=2 --enable-persistence=true --disk-store-name=data --parallel=true

# after all is up, execute command

alter region --name=/testregion --gateway-sender-id=ln

As a result, the command hangs for a few minutes. This is a fault.

> Adding GW sender to allready initialized partitioned region is hanging in
> large cluster
> ---------------------------------------------------------------------------------------
>
>                 Key: GEODE-9642
>                 URL: https://issues.apache.org/jira/browse/GEODE-9642
>             Project: Geode
>          Issue Type: Bug
>          Components: regions, wan
>    Affects Versions: 1.13.0, 1.14.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: needsTriage, pull-request-available
>
> We have observed that adding a parallel GW sender to existing (already
> initialized) partitioned regions hangs.
> When the alter region command is executed (attaching a GW sender to an
> initialized region), it hangs in a cluster with more than 20 servers.
> Execution of the command in a cluster with 16 or fewer servers was successful,
> but if the cluster is expanded to 20 or more servers, the command hangs.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
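
The four `start server` commands in the reproduction steps differ only in the server name and port, so they can be generated instead of typed by hand. The following is a minimal sketch, not part of the original report: the helper names `start_server_command` and `start_server_commands` are hypothetical, the flags are copied from the steps above, and `--server-port` is emitted right after `--name` on the assumption that gfsh does not care about option order. Raising `count` would presumably help when trying the larger cluster sizes mentioned in the description, though every server started on one host still needs its own port and enough memory.

```python
# Flags shared by every server in the reproduction steps (copied from the comment).
COMMON_FLAGS = [
    "--locators=localhost[10332]",
    "--mcast-port=0",
    "--J=-XX:+UseG1GC",
    "--J=-Xms500m",
    "--J=-Xmx500m",
    "--J=-Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true",
    "--J=-Dgemfire.disk.recoverValuesSync=true",
    "--off-heap-memory-size=512m",
    "--J=-Dgemfire.DEFAULT_MAX_OPLOG_SIZE=10",
    "--J=-Dgemfire.EXPIRY_THREADS=1",
    "--J=-Dgemfire.distributed-system-id=1",
    "--J=-Dgemfire.conserve-sockets=false",
]


def start_server_command(index):
    """Build one gfsh 'start server' line; index 11 yields server11 on port 40011."""
    per_server = [f"--name=server{index}", f"--server-port={40000 + index}"]
    return " ".join(["start server"] + per_server + COMMON_FLAGS)


def start_server_commands(first=11, count=4):
    """Return the command lines for servers first .. first + count - 1."""
    return [start_server_command(i) for i in range(first, first + count)]


if __name__ == "__main__":
    # Print the commands so they can be pasted into a gfsh session or script file.
    for line in start_server_commands():
        print(line)
```

With the defaults this prints the commands for server11 through server14, matching the names and ports used in the reproduction steps.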