Re: Problem in rolling upgrade since 1.12

2020-06-08 Thread Alberto Gomez
Hi,

I attach a diff for the modified test case in case you would like to use it to 
check the problem I mentioned.

BR,

Alberto

From: Alberto Gomez 
Sent: Saturday, June 6, 2020 4:06 PM
To: dev@geode.apache.org 
Subject: Problem in rolling upgrade since 1.12

Hi,

I have observed that rolling upgrades from version 1.12 to later versions 
leave the first upgraded locator listed "as if" it were still on version 1.12.

This is the output from "list members" before starting the upgrade from version 
1.12:

Name | Id
---- | ---
vm2  | 192.168.0.37(vm2:29367:locator):41001 [Coordinator]
vm0  | 192.168.0.37(vm0:29260):41002
vm1  | 192.168.0.37(vm1:29296):41003


And this is the output from "list members" after upgrading the first locator 
from 1.12 to 1.13/1.14:

Name | Id
---- | ---
vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]
vm0  | 192.168.0.37(vm0:810):41002(version:GEODE 1.12.0)
vm1  | 192.168.0.37(vm1:849):41003(version:GEODE 1.12.0)


Finally this is the output in gfsh once the rolling upgrade has been completed 
(locators and servers upgraded):

Name | Id
---- | ---
vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]
vm0  | 192.168.0.37(vm0:2457):41002
vm1  | 192.168.0.37(vm1:2576):41003
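
The leftover marker in the last listing could be checked mechanically. The following is only a minimal sketch, not part of Geode or of the attached patch: the class name StaleVersionCheck and its method are hypothetical, and it assumes the "(version:GEODE x.y.z)" suffix appears in the Id column exactly as in the outputs above. It scans captured "list members" text and reports every member whose Id still carries such a marker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Geode API): finds members whose Id still carries a version marker. */
public class StaleVersionCheck {

  // Captures the member name and the version from a row such as
  // "vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]".
  private static final Pattern ROW =
      Pattern.compile("^(\\S+)\\s*\\|\\s*.*\\(version:GEODE ([0-9.]+)\\)");

  /** Returns one "name=version" entry per row that contains a "(version:GEODE x.y.z)" marker. */
  public static List<String> staleMembers(String listMembersOutput) {
    List<String> stale = new ArrayList<>();
    for (String line : listMembersOutput.split("\\R")) {
      Matcher m = ROW.matcher(line.trim());
      if (m.find()) {
        stale.add(m.group(1) + "=" + m.group(2));
      }
    }
    return stale;
  }

  public static void main(String[] args) {
    // Sample input copied from the listing taken after the full rolling upgrade.
    String output = String.join("\n",
        "Name | Id",
        "---- | ---",
        "vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]",
        "vm0  | 192.168.0.37(vm0:2457):41002",
        "vm1  | 192.168.0.37(vm1:2576):41003");
    System.out.println(staleMembers(output));
  }
}
```

On a listing taken after all members have been upgraded, any entry returned by this check would point at a member that is still being reported with the old version, as happens with vm2 here.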

I verified this by running manual tests and also by running the following 
upgrade test (I had to pause it midway to connect via gfsh and capture the 
outputs):
RollingUpgradeRollServersOnPartitionedRegion_dataserializable.testRollServersOnPartitionedRegion_dataserializable

After the rolling upgrade, the shutdown command fails with the following error:
Member 192.168.0.37(vm2:1453:locator):41001 could not be found.  Please 
verify the member name or ID and try again.

The only way I have found to get out of this situation is to restart the 
locator.
Once it has been restarted, gfsh shows all members as upgraded to the new 
version, i.e. the locator no longer reports that it is on version 
GEODE 1.12.0.

Does anybody have any clue why this is happening?

Thanks in advance,

/Alberto G.
diff --git a/geode-core/src/upgradeTest/java/org/apache/geode/internal/cache/rollingupgrade/RollingUpgradeDUnitTest.java b/geode-core/src/upgradeTest/java/org/apache/geode/internal/cache/rollingupgrade/RollingUpgradeDUnitTest.java
index 089b4ffd9d..a501d89a9f 100644
--- a/geode-core/src/upgradeTest/java/org/apache/geode/internal/cache/rollingupgrade/RollingUpgradeDUnitTest.java
+++ b/geode-core/src/upgradeTest/java/org/apache/geode/internal/cache/rollingupgrade/RollingUpgradeDUnitTest.java
@@ -14,6 +14,10 @@
  */
 package org.apache.geode.internal.cache.rollingupgrade;
 
+import static org.apache.geode.distributed.ConfigurationProperties.JMX_MANAGER;
+import static org.apache.geode.distributed.ConfigurationProperties.JMX_MANAGER_PORT;
+import static org.apache.geode.distributed.ConfigurationProperties.JMX_MANAGER_START;
+import static org.apache.geode.internal.AvailablePortHelper.getRandomAvailableTCPPorts;
 import static org.apache.geode.test.awaitility.GeodeAwaitility.await;
 import static org.junit.Assert.assertTrue;
 
@@ -27,6 +31,7 @@ import java.util.List;
 import java.util.Properties;
 
 import org.apache.commons.io.FileUtils;
+import org.junit.Rule;
 import org.junit.runner.RunWith;
 import org.junit.runners.Parameterized;
 import org.junit.runners.Parameterized.Parameter;
@@ -50,7 +55,6 @@ import org.apache.geode.distributed.internal.DistributionConfig;
 import org.apache.geode.distributed.internal.InternalLocator;
 import org.apache.geode.distributed.internal.membership.InternalDistributedMember;
 import org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave;
-import org.apache.geode.internal.AvailablePortHelper;
 import org.apache.geode.internal.serialization.Version;
 import org.apache.geode.test.dunit.DistributedTestUtils;
 import org.apache.geode.test.dunit.Host;
@@ -59,6 +63,7 @@ import org.apache.geode.test.dunit.Invoke;
 import org.apache.geode.test.dunit.NetworkUtils;
 import org.apache.geode.test.dunit.VM;
 import org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase;
+import org.apache.geode.test.junit.rules.GfshCommandRule;
 import org.apache.geode.test.junit.runners.CategoryWithParameterizedRunnerFactory;
 import org.apache.geode.test.version.VersionManager;
 
@@ -78,10 +83,14 @@ import org.apache.geode.test.version.VersionManager;
  * @author jhuynh
  */
 
+
 @RunWith(Parameterized.class)
 @UseParametersRunnerFactory(CategoryWithParameterizedRunnerFactory.class)
 public abstract class RollingUpgradeDUnitTest extends JUnit4DistributedTestCase {
 
+  @Rule
+  public transient GfshCommandRule gfsh = new GfshCommandRule();
+
   @Parameters(name = "from_v{0}")
   public s

Re: Problem in rolling upgrade since 1.12

2020-06-08 Thread Mario Ivanac
Hi,

As far as I can see, the 1.12 release made significant changes to the 
membership part of the code with:

"Move the membership code into a separate gradle sub-project"

https://issues.apache.org/jira/browse/GEODE-6883

It may be worth checking whether this is connected to the reported bug.

BR,
Mario

From: Alberto Gomez 
Sent: June 8, 2020, 9:29
To: dev@geode.apache.org 
Subject: Re: Problem in rolling upgrade since 1.12

Hi,

I attach a diff for the modified test case in case you would like to use it to 
check the problem I mentioned.

BR,

Alberto



Re: Problem in rolling upgrade since 1.12

2020-06-08 Thread Ernie Burghardt
Hi Alberto,

I’m looking at this and came across a couple of blockers.
Do you have a branch that exhibits this problem? A draft PR, maybe?
I tried to apply your patch to the latest develop, but the patch doesn’t pass 
git apply’s check.
Also, these tests pass on develop; would you be able to check against the 
latest and update the diff?
I’m very interested in reproducing the issue you have observed.

Thanks,
Ernie

From: Alberto Gomez 
Reply-To: "dev@geode.apache.org" 
Date: Monday, June 8, 2020 at 12:31 AM
To: "dev@geode.apache.org" 
Subject: Re: Problem in rolling upgrade since 1.12

Hi,

I attach a diff for the modified test case in case you would like to use it to 
check the problem I mentioned.

BR,

Alberto



Re: Problem in rolling upgrade since 1.12

2020-06-08 Thread Alberto Gomez
Hi Ernie,

I have seen this problem in the support/1.13 branch and also on develop.

Interestingly, the patch I sent applies cleanly in my local repo, which is set 
to the develop branch.

The patch modifies the 
RollingUpgradeRollServersOnPartitionedRegion_dataserializable test case so that 
it runs "list members" on the upgraded system. I run it manually with the 
following command:

./gradlew geode-core:upgradeTest --tests=RollingUpgradeRollServersOnPartitionedRegion_dataserializable

I see it failing when upgrading from 1.12.

I created a draft PR where you can also see the changes to the test case that 
expose the problem.

See: https://github.com/apache/geode/pull/5224


Please, let me know if you need any more information.

BR,

Alberto

From: Ernie Burghardt 
Sent: Monday, June 8, 2020 9:04 PM
To: dev@geode.apache.org 
Subject: Re: Problem in rolling upgrade since 1.12

Hi Alberto,

I’m looking at this and came across a couple of blockers.
Do you have a branch that exhibits this problem? A draft PR, maybe?
I tried to apply your patch to the latest develop, but the patch doesn’t pass 
git apply’s check.
Also, these tests pass on develop; would you be able to check against the 
latest and update the diff?
I’m very interested in reproducing the issue you have observed.

Thanks,
Ernie
