Re: [DISCUSS] Geode packages mustn't span Jigsaw modules
We can address package changes in the geode-old-client-support module. That module is integrated with our serialization for this very purpose. Most of our DataSerializable classes implement DataSerializableFixedID, so they can be repackaged w/o any ramifications. Classes that don't fit that category are mostly subclasses of Throwable or Function. On 11/26/18 3:24 PM, Kirk Lund wrote: One problem about moving around internal classes is that Geode uses (proprietary and Java-based) serialization on the wire instead of defining a wire-format that isn't dependent on class names/structures/packages. I for one would love to move to a real wire-format with a well-defined protocol but that's probably a separate project in its own right. Are you really convinced that we could move internal classes around without breaking rolling upgrades, client-server backwards compatibility and diskstore contents? On Mon, Nov 19, 2018 at 5:21 PM Dan Smith wrote: I think we can actually fix most of these issues without geode-2.0. Most of these are in internal packages, which means we can change the package of these classes without breaking users. The only concerning one is org.apache.geode.cache.util, which is a public package. However, the AutoBalancer is actually still marked @Experimental, so we could technically move that as well. Or maybe deprecate the old one and it's associated jar, and create a new jar with a reasonable package. JDK11 users could just avoid the old jar. I have been wondering for a while if we should just fold geode-cq and geode-wan back into geode-core. These are not really properly separated out, they are very tangled with core. The above package overlap kinda shows that as well. But maybe going through the effort of renaming the packages to make them java 11 modules would help improve the code anyway! -Dan On Mon, Nov 19, 2018 at 5:03 PM Owen Nichols wrote: To package Geode as Java 11 Jigsaw module(s), a major hurdle is the requirement that packages cannot be split across modules. If we simply map existing Geode jars to Modules, we have about 10 split packages (see table below). Any restructuring would potentially have to wait until Geode 2.0. A workaround would be to map existing packages into a small number of new modules (e.g. geode-api and geode-internal). Existing jars would remain as-is. Users making the transition to Java 11 /w Jigsaw already need to switch from CLASSPATH to MODULEPATH so the inconvenience of a different naming scheme for Geode-as-modules might be acceptable (however, once module naming and organization is established, we may be stuck with it for a long time). Thoughts? Is it even possible to rearrange all classes in each package below into a single jar? Is doing so desirable enough to defer Java 11 support until a yet-unplanned Geode 2.0, or perhaps to drive such a release? Or, is it satisfactory to fatten Geode releases to include one set of jars for CLASSPATH use, plus a different set of jars for MODULEPATH use? Package Jar org.apache.geode.cache.client.internal geode-core-1.8.0.jar geode-cq-1.8.0.jar geode-wan-1.8.0.jar org.apache.geode.cache.client.internal.locator.wan geode-core-1.8.0.jar geode-wan-1.8.0.jar org.apache.geode.cache.query.internal.cq geode-core-1.8.0.jar geode-cq-1.8.0.jar org.apache.geode.cache.util geode-core-1.8.0.jar geode-rebalancer-1.8.0.jar org.apache.geode.internal geode-connectors-1.8.0.jar geode-core-1.8.0.jar geode-cq-1.8.0.jar geode-lucene-1.8.0.jar geode-wan-1.8.0.jar org.apache.geode.internal.cache.tier.sockets.command geode-core-1.8.0.jar geode-cq-1.8.0.jar org.apache.geode.internal.cache.wan geode-core-1.8.0.jar geode-wan-1.8.0.jar org.apache.geode.internal.cache.wan.parallel geode-core-1.8.0.jar geode-wan-1.8.0.jar org.apache.geode.internal.cache.wan.serial geode-core-1.8.0.jar geode-wan-1.8.0.jar org.apache.geode.internal.protocol.protobuf.v1 geode-protobuf-1.8.0.jar geode-protobuf-messages-1.8.0.jar
Re: What is the new process for flaky test failures in precheckin?
For what it's worth, this specific failure (GEODE-6091) was a problem with the PR pipeline using an old test container image. This should have been corrected yesterday morning. On Mon, Nov 26, 2018 at 10:55 AM Helena Bales wrote: > For test failures that do not have a ticket for that particular stack > trace, you should re-trigger your pre-checkin. If the test fails again, > your change probably caused it to start failing, even if it doesn't seem > related (like the unit test ordering issue we had Friday before last), and > you would be expected to fix it before committing. > If the test passes your next run of tests the process is less established. > Ideally we would never encounter this case because it means we have checked > in flaky code at some point before your branch off of develop, which > StressNewTest should catch. We will probably find a few of these because > the StressNewTest job was silently failing for a little while. It is > working again, but we may have checked in some flaky tests during the time > it was down. > > I propose that we follow the CIO process with these. The process would be > as follows: > 1. Create a Jira ticket for the failure with links to the relevant > resources from the failing CI run. Include any evidence that you have > towards it being a flaky test that exists on develop and not just in your > branch, and evidence that it was not made flaky by your change. > 2. See if that test file was changed recently. If it was, talk to the > person that changed it. > 3. If it wasn't changed recently, post a link to gemfire-green-ci > 4. Comment on your pull request with an update on the status of the failing > test in your run. > > That is just an idea based on what we are doing for the develop pipeline. > It seems like we are having more failures in PR pipelines than we were > seeing a couple of weeks ago, and some tests that seem to only fail in the > PR pipelines, so it might be time to start tracking some of this stuff. My > main concern with this method is that it might pollute our backlog with > tickets that no one is ever going to look at. > > On Mon, Nov 26, 2018 at 10:19 AM Kirk Lund wrote: > > > I just saw SizingFlagDUnitTest fail in a precheckin but it passes on my > > branch when I run directly. I cannot find a Jira ticket for it. What's > the > > new process for handling these flickering tests? > > > > See: > > https://concourse.apachegeode-ci.info/builds/17745 > > > > Test failure stack: > > org.apache.geode.internal.cache.SizingFlagDUnitTest > > > testPRHeapLRUDeltaPutOnPrimary FAILED > > org.apache.geode.test.dunit.RMIException: While invoking > > org.apache.geode.internal.cache.SizingFlagDUnitTest$12.run in VM 0 > running > > on Host eb7aca4f2587 with 4 VMs > > at org.apache.geode.test.dunit.VM.invoke(VM.java:433) > > at org.apache.geode.test.dunit.VM.invoke(VM.java:402) > > at org.apache.geode.test.dunit.VM.invoke(VM.java:361) > > at > > > > > org.apache.geode.internal.cache.SizingFlagDUnitTest.assertValueType(SizingFlagDUnitTest.java:793) > > at > > > > > org.apache.geode.internal.cache.SizingFlagDUnitTest.doPRDeltaTestLRU(SizingFlagDUnitTest.java:312) > > at > > > > > org.apache.geode.internal.cache.SizingFlagDUnitTest.testPRHeapLRUDeltaPutOnPrimary(SizingFlagDUnitTest.java:220) > > > > Caused by: > > org.apache.geode.cache.EntryNotFoundException: Entry not found > for > > key 0 > > at > > > > > org.apache.geode.internal.cache.LocalRegion.checkEntryNotFound(LocalRegion.java:2760) > > at > > > > > org.apache.geode.internal.cache.LocalRegion.nonTXbasicGetValueInVM(LocalRegion.java:3448) > > at > > > > > org.apache.geode.internal.cache.LocalRegionDataView.getValueInVM(LocalRegionDataView.java:105) > > at > > > > > org.apache.geode.internal.cache.LocalRegion.basicGetValueInVM(LocalRegion.java:3436) > > at > > > > > org.apache.geode.internal.cache.LocalRegion.getValueInVM(LocalRegion.java:3424) > > at > > > > > org.apache.geode.internal.cache.PartitionedRegionDataStore.getLocalValueInVM(PartitionedRegionDataStore.java:2775) > > at > > > > > org.apache.geode.internal.cache.PartitionedRegion.getValueInVM(PartitionedRegion.java:8786) > > at > > > > > org.apache.geode.internal.cache.SizingFlagDUnitTest$12.run(SizingFlagDUnitTest.java:797) > > >