Requesting assignment permissions for Geode tickets
Hi, could I get permissions to assign Geode tickets (especially to myself)? My user name is nreich. Thanks, Nick
Re: Review Request 60935: GEODE-3115 Added changes to check for persistent region during pdx type registry.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60935/#review180818 --- geode-core/src/test/java/org/apache/geode/pdx/PdxAttributesJUnitTest.java Lines 370 (patched) <https://reviews.apache.org/r/60935/#comment256088> Duplicate of the code in the test above: consider refactoring out a helper method. geode-core/src/test/java/org/apache/geode/pdx/PdxAttributesJUnitTest.java Lines 377 (patched) <https://reviews.apache.org/r/60935/#comment256090> Leftover println from debugging? geode-core/src/test/java/org/apache/geode/pdx/PdxAttributesJUnitTest.java Line 335 (original), 382 (patched) <https://reviews.apache.org/r/60935/#comment256099> JUnit has a good way to test for exceptions using rules (added in JUnit 4, so the try/catch idiom was the only option back in JUnit 3): http://junit.org/junit4/javadoc/4.12/org/junit/rules/ExpectedException.html - Nick Reich On July 18, 2017, 1:20 a.m., anilkumar gingade wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60935/ > --- > > (Updated July 18, 2017, 1:20 a.m.) > > > Review request for geode, Darrel Schneider, Eric Shu, Lynn Gallinat, and Nick > Reich. > > > Repository: geode > > > Description > --- > > The pdx registry needs to be persisted to use pdx with a persistent region. > Currently while creating pdx the system checks to see if there is a disk > store with data instead of looking for a persistent region in the system. In > cases where a persistent region is created and dropped, the system doesn't > allow creating pdx without setting pdx persistence. 
> > > Diffs > - > > > geode-core/src/main/java/org/apache/geode/internal/cache/GemFireCacheImpl.java > 2dda38c > geode-core/src/main/java/org/apache/geode/internal/cache/InternalCache.java > 4c229db > > geode-core/src/main/java/org/apache/geode/internal/cache/xmlcache/CacheCreation.java > 61f91a0 > > geode-core/src/main/java/org/apache/geode/pdx/internal/PeerTypeRegistration.java > 065255b > > geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/PersistentPartitionedRegionTestBase.java > 45c6120 > geode-core/src/test/java/org/apache/geode/pdx/PdxAttributesJUnitTest.java > ef15cd5 > > > Diff: https://reviews.apache.org/r/60935/diff/1/ > > > Testing > --- > > Added new test. > Precheckin started. > > > Thanks, > > anilkumar gingade > >
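The last review comment above recommends JUnit 4's rule-based exception testing over the try/catch idiom. As a rough illustration (the class and helper names below are invented, not Geode or JUnit code), the linked rule is used like the commented snippet, and the underlying "assert that this action throws" idea can be sketched with the JDK alone:

```java
// Sketch only. The JUnit 4 rule the comment links to is used roughly like
// this (requires the junit library, shown as comments for reference):
//
//   @Rule
//   public ExpectedException thrown = ExpectedException.none();
//
//   @Test
//   public void createThrows() {
//     thrown.expect(IllegalStateException.class);
//     doSomethingThatShouldThrow(); // rule fails the test if nothing is thrown
//   }
//
// The same intent, expressed with only the JDK (helper is hypothetical):
public class ExpectException {
  /** Runs the action and reports whether it threw the expected exception type. */
  public static boolean threw(Class<? extends Throwable> expected, Runnable action) {
    try {
      action.run();
    } catch (Throwable t) {
      return expected.isInstance(t);
    }
    return false; // nothing was thrown, so the expectation is not met
  }
}
```

The advantage over the hand-written try/catch idiom is that the "fail if nothing was thrown" branch cannot be forgotten.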
Re: Review Request 60935: GEODE-3115 Added changes to check for persistent region during pdx type registry.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/60935/#review181270 --- Ship it! Ship It! - Nick Reich On July 21, 2017, 7:18 p.m., anilkumar gingade wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/60935/ > --- > > (Updated July 21, 2017, 7:18 p.m.) > > > Review request for geode, Darrel Schneider, Eric Shu, Lynn Gallinat, and Nick > Reich. > > > Repository: geode > > > Description > --- > > The pdx registry needs to be persisted to use pdx with persistent region. > Currently while creating pdx the system checks, to see if there is a disk > store with data instead of looking for persistent region in the system. In > cases where persistent region is created and dropped, the system doesn't > allow creating pdx witout setting pdx persistence. > > > Diffs > - > > > geode-core/src/main/java/org/apache/geode/internal/cache/GemFireCacheImpl.java > de5fd88 > geode-core/src/main/java/org/apache/geode/internal/cache/InternalCache.java > aed439c > > geode-core/src/main/java/org/apache/geode/internal/cache/xmlcache/CacheCreation.java > db5f7ca > > geode-core/src/main/java/org/apache/geode/pdx/internal/PeerTypeRegistration.java > 065255b > geode-core/src/test/java/org/apache/geode/pdx/PdxAttributesJUnitTest.java > ef15cd5 > > > Diff: https://reviews.apache.org/r/60935/diff/2/ > > > Testing > --- > > Added new test. > Precheckin started. > > > Thanks, > > anilkumar gingade > >
Re: Review Request 61143: GEODE-3310 Set target node in TXStateProxy during TXFailover if necessary
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61143/#review181480 --- geode-core/src/main/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommand.java Line 64 (original), 64 (patched) <https://reviews.apache.org/r/61143/#comment257091> This does not really get a TXId, it creates one. Maybe getNewTXId() or createTXId() would be better names? geode-core/src/test/java/org/apache/geode/internal/cache/execute/PRTransactionDUnitTest.java Lines 178 (patched) <https://reviews.apache.org/r/61143/#comment257093> With "e" or "ex" being the most common names for exceptions in Java catch blocks and "exp" being an abbreviation for "expression", maybe Execution exec and Exception e would be better variable names? geode-core/src/test/java/org/apache/geode/internal/cache/execute/PRTransactionDUnitTest.java Lines 180 (patched) <https://reviews.apache.org/r/61143/#comment257094> assertFalse("Expected exception was not thrown", isFirstFunc)? geode-core/src/test/java/org/apache/geode/internal/cache/execute/PRTransactionDUnitTest.java Lines 184 (patched) <https://reviews.apache.org/r/61143/#comment257096> Possible alternative: catch (TransactionException e) { assertFalse("Unexpected exception was thrown", isFirstFunction); assertTrue(e.getMessage().startsWith("Function execution is not colocated with transaction.")); } Letting unexpected exceptions be thrown from the test will result in them being logged and simplify the logic of the test (and not require manually logging that information as well). You could arguably remove the check for the contents of the exception message, as that makes the test brittle to changes that do not affect behavior (i.e. error message text changes). If the type of exception thrown is not sufficient, does this suggest we need a new exception that inherits from TransactionException? 
geode-core/src/test/java/org/apache/geode/internal/cache/execute/PRTransactionDUnitTest.java Line 149 (original), 223 (patched) <https://reviews.apache.org/r/61143/#comment257097> Replacing all these anonymous classes with lambdas really helps clean up the code: I am glad we are doing more of this. geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommandTest.java Lines 64 (patched) <https://reviews.apache.org/r/61143/#comment257098> Ouch! That is a massive amount of mocking and even power mocking. I do not envy you in trying to cobble that together. Is this easier/possible to do in an integration test (dunit or otherwise)? Whenever I create tests, the more mocking (and any power mocking) I do, the more I wonder how much I am really testing the code and not my ability to string together the right mock call and response sequences, though if it is the only way to get it done, it is still better than nothing. - Nick Reich On July 26, 2017, 5:37 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61143/ > --- > > (Updated July 26, 2017, 5:37 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Lynn Gallinat, > and Nick Reich. 
> > > Bugs: GEODE-3310 > https://issues.apache.org/jira/browse/GEODE-3310 > > > Repository: geode > > > Description > --- > > Set target node in TXStateProxy during TXFailover if necessary > > > Diffs > - > > > geode-core/src/main/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommand.java > 6bd00c0 > > geode-core/src/test/java/org/apache/geode/internal/cache/execute/MyTransactionFunction.java > 0970836 > > geode-core/src/test/java/org/apache/geode/internal/cache/execute/PRTransactionDUnitTest.java > 7a56644 > > geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommandTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61143/diff/1/ > > > Testing > --- > > Ongoing precheckin. > > > Thanks, > > Eric Shu > >
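The comment above about replacing anonymous classes with lambdas can be shown in isolation (the names here are invented; this is not the PRTransactionDUnitTest code):

```java
import java.util.function.Supplier;

public class LambdaCleanup {
  // Before: the anonymous inner class style common in older test code.
  public static Supplier<Integer> anonymousStyle() {
    return new Supplier<Integer>() {
      @Override
      public Integer get() {
        return 42;
      }
    };
  }

  // After: the same behavior as a lambda, one line instead of six, which is
  // the cleanup the review comment is praising.
  public static Supplier<Integer> lambdaStyle() {
    return () -> 42;
  }
}
```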
Re: Review Request 61143: GEODE-3310 Set target node in TXStateProxy during TXFailover if necessary
> On July 26, 2017, 8:11 p.m., Nick Reich wrote: > > All my comments are nits and only suggestions / thoughts and not barriers to accepting this review and shipping it. - Nick --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61143/#review181480 --- On July 26, 2017, 5:37 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61143/ > --- > > (Updated July 26, 2017, 5:37 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Lynn Gallinat, > and Nick Reich. > > > Bugs: GEODE-3310 > https://issues.apache.org/jira/browse/GEODE-3310 > > > Repository: geode > > > Description > --- > > Set target node in TXStateProxy during TXFailover if necessary > > > Diffs > - > > > geode-core/src/main/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommand.java > 6bd00c0 > > geode-core/src/test/java/org/apache/geode/internal/cache/execute/MyTransactionFunction.java > 0970836 > > geode-core/src/test/java/org/apache/geode/internal/cache/execute/PRTransactionDUnitTest.java > 7a56644 > > geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommandTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61143/diff/1/ > > > Testing > --- > > Ongoing precheckin. > > > Thanks, > > Eric Shu > >
Re: Review Request 61143: GEODE-3310 Set target node in TXStateProxy during TXFailover if necessary
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61143/#review181588 --- Ship it! Ship It! - Nick Reich On July 27, 2017, 5:32 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61143/ > --- > > (Updated July 27, 2017, 5:32 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Lynn Gallinat, > and Nick Reich. > > > Bugs: GEODE-3310 > https://issues.apache.org/jira/browse/GEODE-3310 > > > Repository: geode > > > Description > --- > > Set target node in TXStateProxy during TXFailover if necessary > > > Diffs > - > > > geode-core/src/main/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommand.java > 6bd00c0 > > geode-core/src/test/java/org/apache/geode/internal/cache/execute/MyTransactionFunction.java > 0970836 > > geode-core/src/test/java/org/apache/geode/internal/cache/execute/PRTransactionDUnitTest.java > 7a56644 > > geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/command/TXFailoverCommandTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61143/diff/2/ > > > Testing > --- > > Ongoing precheckin. > > > Thanks, > > Eric Shu > >
[DISCUSS] Improvements to backups
There is a desire to improve backup creation and restore. The suggested improvements are listed below and I am seeking feedback from the community:

1) Allow saving of backups to different locations/systems: currently, backups are saved to a directory on each member. Users can move those backups elsewhere manually or through scripting, but it would be advantageous to allow direct backups to cloud storage providers (Amazon, Google, Azure, etc.) and possibly other systems. To make this possible, it is proposed to refactor backups into a service-style architecture with backup location plugins that can be used to specify the target location. This would allow creation of additional backup strategies as demand is determined and allow users to create their own plugins for their own special use cases.

2) Changing the backup restore procedure: backups create a restore script per member that must be run on each member being restored. The script created is based on the OS of the machine the backup is created on (it mainly moves files to the correct directories). A more flexible system would be to instead create a metadata file (XML, YAML, etc.) which contains information on the files in the backup. This would allow the logic for moving files and other activities in the backup restore process to be maintained in our codebase in an operating-system-agnostic way. Because the existing script is not dependent on Geode code, old backups would not be affected by this change, though the process for restoring new backups would be (likely using gfsh instead of sh or bat scripts).

3) Improved incremental backups: incremental backup allows for significant space savings and is much quicker to run. However, it suffers from the problem that you can only restore to the latest time the incremental backup was run, as we overwrite user files, cache XML, and properties, among other files in the backup directory. By saving this information to timestamped directories, restoring to a specific point in time would be as simple as choosing the newest point in the backup to include in the restore. Using timestamped directories for normal backups as well would prevent successive backups from overwriting each other.
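A minimal sketch of the metadata-file idea in item 2, assuming an invented one-line-per-file manifest format (this is an illustration of the concept, not a proposed Geode format):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

public class MetadataRestore {
  /** Parses one manifest line of the (hypothetical) form "path/in/backup -> destination/path". */
  public static String[] parseEntry(String line) {
    String[] parts = line.split("\\s*->\\s*", 2);
    if (parts.length != 2) {
      throw new IllegalArgumentException("bad manifest entry: " + line);
    }
    return parts;
  }

  /**
   * Copies every file named in the manifest from the backup directory to the
   * target. Because the file-moving logic lives in code rather than in a
   * generated .sh/.bat script, the same restore runs on any operating system.
   */
  public static void restore(Path backupDir, Path targetDir, List<String> manifest) throws IOException {
    for (String line : manifest) {
      String[] entry = parseEntry(line);
      Path dest = targetDir.resolve(entry[1]);
      Files.createDirectories(dest.getParent());
      Files.copy(backupDir.resolve(entry[0]), dest, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```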
Re: [DISCUSS] Improvements to backups
Dan, you are correct on #3: there is one location where this appears not to be the case, but it is unused, so timestamped directories are already implemented and overwrites should not be possible. That also covers incremental backups and removes the need for change #3. However, it means that incremental backups need to know the timestamped directory of the last backup. This suggests a different potential optimization: keeping the timestamped incremental backup directories in a base directory and using either a metadata file or the timestamps in the directory names to determine the last incremental backup, automatically using it as the baseline for the current backup (instead of having to manually pass the previous backup's directory to the current backup command). On Thu, Aug 10, 2017 at 10:37 AM, Dan Smith wrote: > +1 this all looks good to me. I think #2 in particular would probably > simplify the incremental backup code. > > For #3, I could have sworn the backups were already going into timestamped > directories and nothing got overwritten in an existing backup. If that is > not already happening that definitely should change! > > -Dan > > On Thu, Aug 10, 2017 at 10:31 AM, Nick Reich wrote: > > > There is a desire to improve backup creation and restore. The suggested > > improvements are listed below and I am seeking feedback from the > community: > > > > 1) Allow saving of backups to different locations/systems: currently, > > backups are saved to a directory on each member. Users can manually or > > through scripting move those backups elsewhere, but it would be > > advantageous to allow direct backups to cloud storage providers (amazon, > > google, azure, etc.) and possibly other systems. To make this possible, it > > is proposed to refactor backups into a service style architecture with > > backup location plugins that can be used to specify the target location. 
> > This would allow creation of additional backup strategies as demand is > > determined and allow users to create their own plugins for their own > > special use cases. > > > > 2) Changing backup restore procedure: backups create a restore script per > > member that must be run from each member to restore a backup to. The > script > > created is based on the OS of the machine the backup is created on (it > > mainly moves files to the correct directories). A more flexible system > > would be to instead create a metadata file (xml, yaml, etc.) which > contains > > information on the files in the backup. This would allow the logic for > > moving files and other activities in the backup restore process to be > > maintained in our codebase in an operating system agnostic way. Because > the > > existing script is not dependent on geode code, old backups would not be > > affected by this change, though the process for restoring new backups > would > > (likely using gfsh instead of sh or bat scripts). > > > > 3) Improved incremental backups: incremental backup allows for > significant > > space savings and is much quicker to run. However, it suffers from the > > problem that you can only restore to the latest time the incremental > backup > > was run, as we overwrite user files, cache xml and properties, among > other > > files in the backup directory. By saving this information to timestamped > > directories, restoring to a specific time point would be as simple as > > choosing the newest point in the backup to include in the restore. Using > > timestamped directories for normal backups as well would prevent > successive > > backups from overwriting each other. > > >
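The automatic-baseline idea in the reply above can be sketched as follows. The timestamped directory-name format used here (yyyy-MM-dd-HH-mm-ss) is an assumption for illustration, not necessarily what Geode's backups produce:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

public class IncrementalBaseline {
  // Assumed name format: yyyy-MM-dd-HH-mm-ss. Because fields are zero-padded
  // and ordered most- to least-significant, lexicographic order equals
  // chronological order, so the maximum string is the newest backup.
  private static final Pattern TIMESTAMP = Pattern.compile("\\d{4}(-\\d{2}){5}");

  /**
   * Picks the newest timestamped directory from a base directory's contents
   * to use as the baseline for the next incremental backup, instead of the
   * user manually passing the previous backup's directory.
   */
  public static Optional<String> latestBackup(List<String> dirNames) {
    return dirNames.stream()
        .filter(name -> TIMESTAMP.matcher(name).matches())
        .max(Comparator.naturalOrder());
  }
}
```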
Re: ParallelSnapshotFileMapper broke AnalyzeSerializablesJUnitTest
I have created GEODE-3435 to fix the issue. On Fri, Aug 11, 2017 at 3:41 PM, Kirk Lund wrote: > Looks like it was this commit... > > commit 7072f8ef7b764d1507e26cca8ed0c4d184ccc81a > Author: Nick Reich > Date: Fri Jul 28 16:47:10 2017 -0700 > > GEODE-3300: Complete and expose parallel export feature for use > > This closes #704 > > > On Fri, Aug 11, 2017 at 3:40 PM, Kirk Lund wrote: > > > Looks like one of the recent commits to develop resulted in this failure > > involving org/apache/geode/internal/cache/snapshot/ > > ParallelSnapshotFileMapper... > > > > :geode-core:integrationTest > > > > org.apache.geode.codeAnalysis.AnalyzeSerializablesJUnitTest > > > testSerializables FAILED > > java.lang.AssertionError: New or moved classes--- > > - > > org/apache/geode/internal/cache/snapshot/ParallelSnapshotFileMapper, > > true,1 > > > > > > If the class is not persisted or sent over the wire add it to the > file > > /tmp/build/ae3c03f4/gemfire/open/geode-core/src/test/ > > resources/org/apache/geode/codeAnalysis/excludedClasses.txt > > Otherwise if this doesn't break backward compatibility, copy the file > > /tmp/build/ae3c03f4/geode/geode-core/build/integrationTest/ > actualSerializables.dat > > to > > /tmp/build/ae3c03f4/gemfire/open/geode-core/src/test/ > > resources/org/apache/geode/codeAnalysis/sanctionedSerializables.txt. > > at org.apache.geode.codeAnalysis.AnalyzeSerializablesJUnitTest. > > testSerializables(AnalyzeSerializablesJUnitTest.java:150) > > > > 3711 tests completed, 1 failed, 140 skipped > > :geode-core:integrationTest FAILED > > >
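The failure quoted above is essentially a set difference: serializable classes found in the build, minus the sanctioned list, minus the excluded list. A hedged sketch of that comparison (class and method names invented, not the actual AnalyzeSerializablesJUnitTest logic):

```java
import java.util.Set;
import java.util.TreeSet;

public class SerializableAudit {
  /**
   * Returns serializable classes present in the build but missing from both
   * the sanctioned and excluded lists -- the "new or moved classes" condition
   * that made testSerializables fail when ParallelSnapshotFileMapper was
   * added without updating sanctionedSerializables.txt.
   */
  public static Set<String> newOrMoved(Set<String> actual, Set<String> sanctioned, Set<String> excluded) {
    Set<String> diff = new TreeSet<>(actual); // sorted, for stable reporting
    diff.removeAll(sanctioned);
    diff.removeAll(excluded);
    return diff;
  }
}
```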
Re: Review Request 61676: Release the lock held when beforeCompletion failed with CommitConflictException
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61676/#review183009 --- Ship it! Ship It! - Nick Reich On Aug. 15, 2017, 10:47 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61676/ > --- > > (Updated Aug. 15, 2017, 10:47 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Lynn Gallinat, > and Nick Reich. > > > Bugs: GEODE-3444 > https://issues.apache.org/jira/browse/GEODE-3444 > > > Repository: geode > > > Description > --- > > When JTA transaction failed with exception, it is better to release the locks > it held right away. > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/TXState.java > 55415e3 > > > Diff: https://reviews.apache.org/r/61676/diff/1/ > > > Testing > --- > > Precheckin. > > > Thanks, > > Eric Shu > >
Re: Review Request 61676: Release the lock held when beforeCompletion failed with CommitConflictException
> On Aug. 16, 2017, 10:47 p.m., anilkumar gingade wrote: > > geode-core/src/main/java/org/apache/geode/internal/cache/TXState.java > > Line 1024 (original), 1024 (patched) > > <https://reviews.apache.org/r/61676/diff/1/?file=1798418#file1798418line1024> > > > > And remove this one too... Yes, in both cases the wrapped exception thrown is getting caught by that outer catch block. - Nick --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61676/#review183078 --- On Aug. 15, 2017, 10:47 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61676/ > --- > > (Updated Aug. 15, 2017, 10:47 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Lynn Gallinat, > and Nick Reich. > > > Bugs: GEODE-3444 > https://issues.apache.org/jira/browse/GEODE-3444 > > > Repository: geode > > > Description > --- > > When JTA transaction failed with exception, it is better to release the locks > it held right away. > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/TXState.java > 55415e3 > > > Diff: https://reviews.apache.org/r/61676/diff/1/ > > > Testing > --- > > Precheckin. > > > Thanks, > > Eric Shu > >
Re: Review Request 61722: GEODE-3047 Atomic creation flag is not set if the region is recovered from the disk.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61722/#review183187 --- Ship it! Ship It! - Nick Reich On Aug. 17, 2017, 11:20 p.m., anilkumar gingade wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61722/ > --- > > (Updated Aug. 17, 2017, 11:20 p.m.) > > > Review request for geode, Darrel Schneider, Eric Shu, Lynn Gallinat, and Nick > Reich. > > > Repository: geode > > > Description > --- > > While creating a bucket region, to satisfy the redundancy copies the bucket > regions > are created on all available nodes. The initialization (setting persistentIDs) > of > these buckets is done after creating buckets on all the nodes. This > introduced > a race condition for the bucket regions which are recovered from the disk to > exchange their old id with the peer node, resulting in > ConflictingPersistentData > Exception. > > The changes are done so that the regions' persistent ids are set as soon as > they > are created/initialized. > > > Diffs > - > > > geode-core/src/main/java/org/apache/geode/internal/cache/BucketPersistenceAdvisor.java > 367bb75e9 > > geode-core/src/test/java/org/apache/geode/internal/cache/BucketPersistenceAdvisorTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/61722/diff/1/ > > > Testing > --- > > added new unit test. > run PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs > 1400 times > > > Thanks, > > anilkumar gingade > >
Adding parallel import/export of snapshots to gfsh
Team, I am working on exposing the parallel export/import of snapshots through gfsh and would appreciate input on the best approach to adding to / updating the existing interface. Currently, ExportDataCommand and ImportDataCommand take a region name, a member to run the command on, and a file location (that must end in .gfd). Parallel import and export require a directory location instead of a single file name (as there can be multiple files and a need for uniquely named files). It is possible to add a parallel flag and have the meaning of the "file" parameter differ depending on that flag, but that seems overly confusing to me. I am instead leaning towards creating new commands (e.g. ParallelExportDataCommand) that have a "directory" parameter in place of "file" but are otherwise identical in usage to the existing commands. Do others have different views or approaches to suggest?
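One way to picture the "uniquely named files" requirement: a parallel export produces multiple files, so each needs a distinct name even when members write into the same directory. The naming pattern below is purely illustrative (Geode's actual mapper may use a different scheme):

```java
public class SnapshotFileNames {
  /**
   * Builds a snapshot file name that stays unique across a parallel export:
   * one file per bucket, tagged with the exporting member, keeping the .gfd
   * extension the serial export already requires. Pattern is hypothetical.
   */
  public static String bucketFileName(String region, int bucketId, String memberId) {
    return String.format("%s-bucket-%03d-%s.gfd", region, bucketId, memberId);
  }
}
```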
Re: Adding parallel import/export of snapshots to gfsh
I thought about mutually exclusive --file and --dir options, but in that case --file is required for standard export and --path required for parallel export, which I think could be better than overloading --file but still has potential for confusion. The idea of deprecating --file in favor of --path is interesting. I wonder if, instead of making them mutually exclusive to start, having --path support both modes from the start would be better? That way --file could still be used for the existing mode, but --path could be used instead (and override --file if both are given?): that would provide a clear path forward for how the command should be used, while fully supporting existing workflows. We need to continue to support both modes, as only Partitioned Regions can make use of parallel export (it is parallelized at the bucket level). On Tue, Aug 22, 2017 at 12:55 PM, Jacob Barrett wrote: > How about deprecate --file and replace with --path? In the transition make > them mutually exclusive and --path required for --parallel. > > Any reason to not just make all export parallel rather than supporting two > different modes? > > -Jake > > > Sent from my iPhone > > > On Aug 22, 2017, at 12:27 PM, Kenneth Howe wrote: > > > > I agree that overloading the “file” option seems like a bad idea. As an > alternative to separate commands, what about mutually exclusive options, > ‘--file’ and ‘--dir’? > > > > If you go for implementing the new functionality as a separate command, > I would suggest calling the gfsh commands: “export data-parallel” and > “import data-parallel" > > > >> On Aug 22, 2017, at 11:32 AM, Nick Reich wrote: > >> > >> Team, > >> > >> I am working on exposing the parallel export/import of snapshots through > >> gfsh and would appreciate input on the best approach to adding to / > >> updating the existing interface. > >> > >> Currently, ExportDataCommand and ImportDataCommand take a region name, a > >> member to run the command on, and a file location (that must end in > .gfd). 
> >> Parallel import and export require a directory location instead of a > single > >> file name (as there can be multiple files and need for uniquely named > >> files). It is possible to add a parallel flag and have the meaning of > the > >> "file" parameter be different depending on that flag, but that seems > overly > >> confusing to me. I am instead leaning towards creating new commands > (e.g. > >> ParallelExportDataCommand) that has a "directory" parameter to replace > >> "file", but is otherwise identical in usage to the existing commands. > >> > >> Do others have different views or approaches to suggest? > > >
Re: Adding parallel import/export of snapshots to gfsh
Parallel export will write the data to files on the bucket primary for each bucket, distributing the work (and therefore the files) across all the members. That would be a big enough deviation from the current behavior (a single file on a single machine) that I think it is worth having the additional options (though I agree: fewer options is generally better). On Tue, Aug 22, 2017 at 1:59 PM, Jacob Barrett wrote: > On Tue, Aug 22, 2017 at 1:49 PM Nick Reich wrote: > > > The idea of deprecating —file in favor of path is interesting. I wonder > if > > instead of making them mutually exclusive to start, having —path be able > to > > support both modes from the start would be better? That way —file could > > still be used for the existing mode, but —path could be used instead (and > > override —file is both given?): that would provide a clear path forward > for > > how the command should be used, while fully supporting existing > workflows. > > > > This is what I meant by deprecating. Maybe even providing a message that if > --file is set that it is deprecated for --path. > > > > We need to continue to support both modes, as only Partitioned Regions > can > > make use of parallel export (it is parallelized on a bucket level). > > > > Ok, so why not just make parallel the only mode for partitioned. Then you > remove the need for --parallel and --path would work for any region, > non-partitioned would create a single file at that path and partitioned > would create several? I am all for less options. ;) > > -Jake >
Re: Adding parallel import/export of snapshots to gfsh
With minimal code change, it is possible to enable the use of --dir for both standard and parallel export/import, allowing --file to function only for standard exports (and, optionally, deprecating it in favor of the --dir option). While not inherently opposed to forcing all Partitioned Region snapshots to be parallel, that seems a significant enough change to current functionality (one file on one member becoming multiple files on multiple members) that I am hesitant to do so without united community backing for that approach. On Tue, Aug 22, 2017 at 2:24 PM, Michael Stolz wrote: > One other idea that hasn't been mentioned is making parallel the only way > for Partitioned Regions, and having --file serve the purpose of defining > both a path and a filename pattern where the bucket ID or whatever we're > using gets automatically inserted before the .gfd extension. > > No need for a new option (--parallel). > No need for a new option (--path). > > In fact, no need for a change to gfsh command at all. > > > -- > Mike Stolz > Principal Engineer, GemFire Product Manager > Mobile: +1-631-835-4771 > > On Tue, Aug 22, 2017 at 2:15 PM, Nick Reich wrote: > > > Parallel export will write the data to files on the bucket primary for > each > > bucket, distributing the work (and therefore files) to all the members. > > That would be a big enough deviation from the current behavior (single > file > > on single machine), that I think it makes it worth having the additional > > options (but I agree: less options is generally better). > > > > On Tue, Aug 22, 2017 at 1:59 PM, Jacob Barrett > > wrote: > > > > > On Tue, Aug 22, 2017 at 1:49 PM Nick Reich wrote: > > > > > > > The idea of deprecating —file in favor of path is interesting. I > wonder > > > if > > > > instead of making them mutually exclusive to start, having —path be > > able > > > to > > > > support both modes from the start would be better? 
That way —file > could > > > > still be used for the existing mode, but —path could be used instead > > (and > > > > override —file is both given?): that would provide a clear path > forward > > > for > > > > how the command should be used, while fully supporting existing > > > workflows. > > > > > > > > > > This is what I meant by deprecating. Maybe even providing a message > that > > if > > > --file is set that it is deprecated for --path. > > > > > > > > > > We need to continue to support both modes, as only Partitioned > Regions > > > can > > > > make use of parallel export (it is parallelized on a bucket level). > > > > > > > > > > Ok, so why not just make parallel the only mode for partitioned. Then > you > > > remove the need for --parallel and --path would work for any region, > > > non-partitioned would create a single file at that path and partitioned > > > would create several? I am all for less options. ;) > > > > > > -Jake > > > > > >
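The option-handling scheme discussed in this thread (keep --file working for the existing mode, prefer the new option, warn that the old one is deprecated, and let the new option win if both are given) could be resolved roughly like this. The resolver itself is hypothetical; only the option semantics come from the thread:

```java
public class ExportOptionResolver {
  /**
   * Resolves the export target from the legacy --file option and a proposed
   * --path option: --path wins when both are given, --file still works but is
   * reported as deprecated, and at least one of the two must be present.
   */
  public static String resolveTarget(String file, String path) {
    if (path != null) {
      return path; // new option takes precedence, per the thread's proposal
    }
    if (file != null) {
      System.err.println("WARNING: --file is deprecated; use --path instead");
      return file; // existing workflows keep working
    }
    throw new IllegalArgumentException("either --file or --path is required");
  }
}
```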
Re: Review Request 61895: GEODE-3516: TXManagerImpl.tryResume call may add itself again into the waiting thread queue
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61895/#review183945 --- geode-core/src/test/java/org/apache/geode/internal/cache/TXManagerImplJUnitTest.java Lines 380 (patched) <https://reviews.apache.org/r/61895/#comment259991> What does this Thread.sleep() do, if the next line is awaiting a latch? geode-core/src/test/java/org/apache/geode/internal/cache/TXManagerImplJUnitTest.java Lines 384 (patched) <https://reviews.apache.org/r/61895/#comment259995> Is spyMgr.suspend() here being attempted to occur between when the two threads start and before they complete? If so, is that guaranteed to occur? - Nick Reich On Aug. 25, 2017, 4:53 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61895/ > --- > > (Updated Aug. 25, 2017, 4:53 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Lynn Gallinat, > and Nick Reich. > > > Bugs: GEODE-3516 > https://issues.apache.org/jira/browse/GEODE-3516 > > > Repository: geode > > > Description > --- > > Remove the thread from waiting thread queue after successfully resumed the > transaction > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/TXManagerImpl.java > a0a4d7c > > geode-core/src/test/java/org/apache/geode/internal/cache/TXManagerImplJUnitTest.java > a2c1e70 > > > Diff: https://reviews.apache.org/r/61895/diff/2/ > > > Testing > --- > > precheckin. > > > Thanks, > > Eric Shu > >
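The sleep-versus-latch question in the review above comes down to ordering guarantees: a Thread.sleep() only adds a delay, while awaiting a latch guarantees the other threads reached a known point. A minimal, self-contained sketch (not the TXManagerImplJUnitTest code) of guaranteeing that a "suspend" step runs only after both worker threads have started:

```java
import java.util.concurrent.CountDownLatch;

public class LatchCoordination {
  /**
   * Returns "WWS" deterministically: both workers append "W" and count down
   * before the main thread's await() returns, so the "S" (standing in for a
   * suspend() call) is provably ordered after them. A Thread.sleep() in place
   * of await() could not guarantee this, only make it likely.
   */
  public static String runSuspendAfterWorkersStart() {
    CountDownLatch workersReady = new CountDownLatch(2);
    StringBuilder log = new StringBuilder();
    Runnable worker = () -> {
      synchronized (log) { log.append("W"); }
      workersReady.countDown(); // signal: this worker has done its step
    };
    Thread t1 = new Thread(worker);
    Thread t2 = new Thread(worker);
    t1.start();
    t2.start();
    try {
      workersReady.await(); // blocks until both workers counted down
      synchronized (log) { log.append("S"); } // the "suspend" step
      t1.join();
      t2.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
    return log.toString();
  }
}
```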
Re: Review Request 61895: GEODE-3516: TXManagerImpl.tryResume call may add itself again into the waiting thread queue
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61895/#review184710 --- Ship it! Ship It! - Nick Reich On Sept. 6, 2017, 6:42 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61895/ > --- > > (Updated Sept. 6, 2017, 6:42 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Lynn Gallinat, > and Nick Reich. > > > Bugs: GEODE-3516 > https://issues.apache.org/jira/browse/GEODE-3516 > > > Repository: geode > > > Description > --- > > Remove the thread from waiting thread queue after successfully resumed the > transaction > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/TXManagerImpl.java > a0a4d7c > > geode-core/src/test/java/org/apache/geode/internal/cache/TXManagerImplJUnitTest.java > a2c1e70 > > > Diff: https://reviews.apache.org/r/61895/diff/3/ > > > Testing > --- > > precheckin. > > > Thanks, > > Eric Shu > >
Re: Review Request 62164: GEODE-3566 Moving a bucket during rebalancing does not update overflow stats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62164/#review184842 --- geode-core/src/main/java/org/apache/geode/internal/cache/BucketRegion.java Lines 2460 (patched) <https://reviews.apache.org/r/62164/#comment261032> Can remove nesting by checking dr != null && destroyDiskRegion geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java Lines 91 (patched) <https://reviews.apache.org/r/62164/#comment261064> It does not look like the second part of this method uses the shortcut or overflow parameters, so it could be a separate method. Since this method does two different things, that would probably be optimal. - Nick Reich On Sept. 7, 2017, 4:56 p.m., Lynn Gallinat wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62164/ > --- > > (Updated Sept. 7, 2017, 4:56 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Eric Shu, and > Nick Reich. > > > Repository: geode > > > Description > --- > > We now update the stats in a member that lost a bucket due to rebalancing > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/BucketRegion.java > 465a3dd > > geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/62164/diff/1/ > > > Testing > --- > > Precheckin > > > Thanks, > > Lynn Gallinat > >
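The "remove nesting" suggestion in the review above amounts to combining a null check and a boolean flag into one guard. The following sketch is hypothetical; DiskRegionStub and the stat name are invented stand-ins for the real BucketRegion fields:

```java
// Invented stand-in for the real disk region type; only for illustration.
class DiskRegionStub {
    long numOverflowOnDisk = 5;
}

public class GuardClauseExample {
    // Before: if (dr != null) { if (destroyDiskRegion) { ... } }
    // After: one combined condition removes a level of nesting.
    static long overflowToClear(DiskRegionStub dr, boolean destroyDiskRegion) {
        if (dr != null && destroyDiskRegion) {
            return dr.numOverflowOnDisk;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(overflowToClear(new DiskRegionStub(), true)); // 5
        System.out.println(overflowToClear(null, true)); // 0
    }
}
```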
Re: Review Request 62164: GEODE-3566 Moving a bucket during rebalancing does not update overflow stats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62164/#review184872 --- geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java Lines 17 (patched) <https://reviews.apache.org/r/62164/#comment261066> I think this is the wrong library, or are you using assertj on purpose instead of junit? geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java Lines 128 (patched) <https://reviews.apache.org/r/62164/#comment261065> This could be cleaned up by getting the region once at the top of the method and using that variable throughout. geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java Lines 129 (patched) <https://reviews.apache.org/r/62164/#comment261069> This method seems to be doing two distinct things: the first is a vm specific check of region entry and bucket count sizes (which could be its own method that takes only a single vm) and the second is checking overflow, which could also be its own method (especially since only one region is checked). geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java Lines 156 (patched) <https://reviews.apache.org/r/62164/#comment261070> Normalize variable names to camel case? geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java Lines 249 (patched) <https://reviews.apache.org/r/62164/#comment261067> use TEST_REGION constant - Nick Reich On Sept. 7, 2017, 4:56 p.m., Lynn Gallinat wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62164/ > --- > > (Updated Sept. 7, 2017, 4:56 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Eric Shu, and > Nick Reich. 
> > > Repository: geode > > > Description > --- > > We now update the stats in a member that lost a bucket due to rebalancing > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/BucketRegion.java > 465a3dd > > geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/62164/diff/1/ > > > Testing > --- > > Precheckin > > > Thanks, > > Lynn Gallinat > >
Re: Review Request 62164: GEODE-3566 Moving a bucket during rebalancing does not update overflow stats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62164/#review184982 --- Ship it! Ship It! - Nick Reich On Sept. 8, 2017, 4:10 p.m., Lynn Gallinat wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62164/ > --- > > (Updated Sept. 8, 2017, 4:10 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Eric Shu, and > Nick Reich. > > > Repository: geode > > > Description > --- > > We now update the stats in a member that lost a bucket due to rebalancing > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/BucketRegion.java > 465a3dd > > geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/62164/diff/2/ > > > Testing > --- > > Precheckin > > > Thanks, > > Lynn Gallinat > >
Re: Review Request 62164: GEODE-3566 Moving a bucket during rebalancing does not update overflow stats
> On Sept. 7, 2017, 8:08 p.m., Nick Reich wrote: > > geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java > > Lines 17 (patched) > > <https://reviews.apache.org/r/62164/diff/1/?file=1817801#file1817801line17> > > > > I think this is the wrong library, or are you using assertj on purpose > > instead of junit? > > Lynn Gallinat wrote: > The use of assertj was intentional and recommended to me by Kirk. Ok. I just had not seen any tests yet that used assertj here before, so was unsure if that was a new preferred/acceptable library or not. - Nick --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62164/#review184872 --- On Sept. 8, 2017, 4:10 p.m., Lynn Gallinat wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62164/ > --- > > (Updated Sept. 8, 2017, 4:10 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Eric Shu, and > Nick Reich. > > > Repository: geode > > > Description > --- > > We now update the stats in a member that lost a bucket due to rebalancing > > > Diffs > - > > geode-core/src/main/java/org/apache/geode/internal/cache/BucketRegion.java > 465a3dd > > geode-core/src/test/java/org/apache/geode/internal/cache/partitioned/BucketRebalanceStatRegressionTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/62164/diff/2/ > > > Testing > --- > > Precheckin > > > Thanks, > > Lynn Gallinat > >
Re: Review Request 62506: GEODE-3679: Original client member Id in the transaction should be set in the PartitionMessage
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62506/#review186003 --- geode-core/src/test/java/org/apache/geode/internal/cache/ClientServerTransactionDUnitTest.java Lines 4109 (patched) <https://reviews.apache.org/r/62506/#comment262355> This is actually "doTransactionOnServer", as there is nothing about the method which is tied to server1. geode-core/src/test/java/org/apache/geode/internal/cache/ClientServerTransactionDUnitTest.java Lines 4114 (patched) <https://reviews.apache.org/r/62506/#comment262351> Avoid single character variable names geode-core/src/test/java/org/apache/geode/internal/cache/ClientServerTransactionDUnitTest.java Lines 4120 (patched) <https://reviews.apache.org/r/62506/#comment262354> Can this have a generic type like above method? Would remove casting to Integer below. geode-core/src/test/java/org/apache/geode/internal/cache/ClientServerTransactionDUnitTest.java Lines 4125 (patched) <https://reviews.apache.org/r/62506/#comment262353> Do not need to hold onto the set: for (Object key : region.keySet()) - Nick Reich On Sept. 22, 2017, 5:01 p.m., Eric Shu wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62506/ > --- > > (Updated Sept. 22, 2017, 5:01 p.m.) > > > Review request for geode, anilkumar gingade, Darrel Schneider, Kirk Lund, > Lynn Gallinat, and Nick Reich. > > > Bugs: GEODE-3679 > https://issues.apache.org/jira/browse/GEODE-3679 > > > Repository: geode > > > Description > --- > > Sets the originating member id from client transaction in partition message > when a server needs to perform operation and send to other peers. 
> > > Diffs > - > > > geode-core/src/main/java/org/apache/geode/internal/cache/partitioned/PartitionMessage.java > 8c27107 > > geode-core/src/test/java/org/apache/geode/internal/cache/ClientServerTransactionDUnitTest.java > 96b89b9 > > > Diff: https://reviews.apache.org/r/62506/diff/1/ > > > Testing > --- > > precheckin. > > > Thanks, > > Eric Shu > >
Re: Rebase and squash before merging PRs
Here are the docs from github: https://help.github.com/articles/about-pull-request-merges/ Based on those and using "squash and merge" for some of my merges, it looks like it does what we want: just one commit for the merge of the feature branch. Note that "rebase and merge" in github does not actually work exactly like it does in git (see above link). On Thu, Oct 5, 2017 at 4:15 PM, Jared Stewart wrote: > Does anyone happen to know if “squash and merge” also does a rebase or > not? I’ve been hesitant to use that button since I’m not sure what exact > sequence of git commands it corresponds to. > > > On Oct 5, 2017, at 3:59 PM, Jason Huynh wrote: > > > > I think we can also use "squash and merge" if wanting to squash commits > > before merging. This would allow you not to have to force push every > time. > > > > On Thu, Oct 5, 2017 at 3:15 PM Jinmei Liao wrote: > > > >> On the PR UI page, you can do that by pulling down the menu when you > are > >> ready to merge. Remember to use "Rebase and merge". > >> > >> > >> > >> > >> Not sure if this is useful to everyone, but when I push a subsequent > commit to my feature branch, I always use "force push", so that it's only > one commit I need to rebase to develop. > >> > >> > >> On Thu, Oct 5, 2017 at 3:00 PM, Jared Stewart > wrote: > >> > >>> I’ve been seeing a lot more merge commits on develop since we moved to > >>> Gitbox. Just wanted to give everyone a friendly reminder to please > rebase > >>> before merging to keep our git history tidy and readable. > >>> > >>> Thanks, > >>> Jared > >> > >> > >> > >> > >> -- > >> Cheers > >> > >> Jinmei > >> > >
Re: [Discuss] CliStrings
+1 for moving those messages out of CliStrings if at all possible and placing them where they are used. In my experience with those strings, I have rarely if ever seen them reused across classes, so they really should belong in the class they are used by. On Thu, Oct 19, 2017 at 3:05 PM, Jared Stewart wrote: > I wanted to kick off a discussion about the usage of CliStrings. For > those unfamiliar, it’s a java class that contains about ~3000 lines of > String constants and has a javadoc explaining that it is an attempt at i18n > localization. Does anyone know if this localization is actually > implemented in practice? > > If not, I would like to suggest that we try to move away from this pattern > going forward. We have ended up with many constants in CliStrings like > this: > CliStrings.CREATE_REGION__MSG__ONLY_ONE_OF_REGIONSHORTCUT_ > AND_USEATTRIBUESFROM_CAN_BE_SPECIFIED > The constant is only used in CreateRegionCommand, so I would be happier to > see it as a member of CreateRegionCommand (where there would be no need for > the unwieldy "CREATE_REGION__MSG__” prefix) unless there is localization > being done which I am unaware of. > > Thoughts? > > Thanks, > Jared
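The refactor suggested above can be sketched as follows. This is hypothetical: the class shape, constant name, and message text are invented to show the pattern of a message constant living privately in the one command that uses it, with no namespace prefix:

```java
public class CreateRegionCommandSketch {
    // Hypothetical: the message is a private member of the command itself,
    // so no "CREATE_REGION__MSG__" prefix is needed to disambiguate it.
    private static final String ONLY_ONE_OF_TYPE_AND_TEMPLATE =
        "Only one of --type and --template-region may be specified.";

    // Returns an error message, or null if the option combination is valid.
    static String validateOptions(boolean hasType, boolean hasTemplate) {
        if (hasType && hasTemplate) {
            return ONLY_ONE_OF_TYPE_AND_TEMPLATE;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(validateOptions(true, true));
        System.out.println(validateOptions(true, false)); // null
    }
}
```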
[DISCUSS] Benchmarks module package structure
Team, I am in the process of adding new benchmarks to the (currently sparse) geode-benchmarks module. The lack of existing examples in the module leads me to wonder what the correct organization of benchmarks in the module is (and whether this is the right location). The existing benchmarks are all in org.apache.geode.cache.benchmark. Following this pattern would (presumably) result in benchmark subpackages in each package that has benchmarks. Making the root package org.apache.geode.benchmark would avoid this proliferation of subpackages. However, both of these approaches have the issue that package-level methods/classes cannot be accessed from benchmarks, as the benchmarks will never share a package with the production code. 1) Should benchmarks then not be put in special benchmark packages? 2) Should our benchmarks not invoke package-level methods/classes in the case that we do use benchmark packages? Or should such benchmarks not reside in the benchmarks module? 3) Is geode-benchmarks where we intend all benchmarks, only certain classes of benchmarks (all using jmh, for example), or would we prefer embedding them in the modules where the code being benchmarked resides? Thanks for your input.
Re: [DISCUSS] Benchmarks module package structure
I think if we can make it work, removing the benchmarks module and moving the benchmarks into the module where the code they are testing resides would be ideal. Talking with Dan, this would result in a src/jmh/java source directory in any module with benchmarks, which I think would result in a good level of proximity between the benchmarks and the code they are testing. On Mon, Jan 8, 2018 at 3:15 PM, Kirk Lund wrote: > We'll probably end up writing benchmarks for classes or methods that are > package-private, so that would be one reason to not create a special > benchmarks package. > > On Mon, Jan 8, 2018 at 1:40 PM, Dan Smith wrote: > > > I think this module was specifically added for running JMH benchmarks > since > > it's pulling in the JMH plugin. JMH is a framework for easily writing > > microbenchmarks. > > > > I think it makes sense to put JMH benchmarks in the same package as the > > code under test, since you may end up writing a microbenchmark for just > > one internal class. Arguably, all of the modules could pull in the JMH > > plugin and these benchmarks could sit in those modules. I think the > current > > structure might have just been the easiest way to integrate JMH into the > > build. > > > > -Dan > > > > On Sun, Jan 7, 2018 at 9:59 PM, Xiaojian Zhou wrote: > > > > > The package might always be a problem. Even if you put the cq benchmark > > > code under geode-cq to be near its source code, it might still have to > > > access code under other packages, such as geode-core. > > > > > > So I think putting benchmark test code under a benchmark package is ok. > > > Your option 2) is good. > > > > > > Regards > > > Gester > > > > > > On Fri, Jan 5, 2018 at 11:57 AM, Nick Reich wrote: > > > > > > > Team, > > > > > > > > I am in the process of adding new benchmarks to the (currently > sparse) > > > > geode-benchmarks module. 
The lack of current examples in the module > > leads > > > > me to wonder what the correct organization of benchmarks in the > module > > is > > > > (and if this is the right location). > > > > > > > > The existing benchmarks are all in org.apache.geode.cache.benchmark. > > > > Following this pattern would (presumably) result in benchmark > > subpackages > > > > in each package that has benchmarks. Making the root package > > > > org.apache.geode.benchmark would remove this proliferation of sub > > > packages. > > > > However, both these approaches have the issue that package level > > > > methods/classes cannot be accessed from benchmarks as they will never > > > share > > > > a package with the production code. > > > > > > > > 1) Should benchmarks then not be put in special benchmark packages? > > > > > > > > 2) Should our benchmarks not invoke package level methods/classes in > > the > > > > case that we should use benchmark packages? Or should such benchmarks > > not > > > > reside in the benchmarks module? > > > > > > > > 3) Is geode-benchmarks where we intend all benchmarks, only certain > > > classes > > > > of benchmarks (all using jmh for example), or would we prefer > embedding > > > > them in the modules where the code being benchmarked resides? > > > > > > > > Thanks for your input. > > > > > > > > > >
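The thread above settles on JMH for microbenchmarks. JMH itself handles warmup, forking, and dead-code elimination; the plain-Java sketch below (not a JMH benchmark, and not from the geode-benchmarks module) only illustrates the shape of a timed measurement and why a warmup phase matters before timing a hot method:

```java
public class NaiveBenchmarkSketch {
    // A deliberately simple method under test.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Plain-Java illustration only: JMH does all of this properly.
    static double averageNanos(int iterations) {
        // Warmup so the JIT compiles the hot method before measurement.
        for (int i = 0; i < 10_000; i++) {
            sumTo(1_000);
        }
        long start = System.nanoTime();
        long blackhole = 0; // consume results so the loop is not eliminated
        for (int i = 0; i < iterations; i++) {
            blackhole += sumTo(1_000);
        }
        long elapsed = System.nanoTime() - start;
        if (blackhole == 42) {
            System.out.println("unlikely"); // keep blackhole observably live
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        System.out.println("avg ns/op: " + averageNanos(100_000));
    }
}
```

A src/jmh/java version of this would instead annotate the method with @Benchmark and let the harness manage iterations and result sinking.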
Re: [PROPOSAL]: Adding declarable support in Gfsh command
This would solve the problem of specifying the parameters for a Declarable, but if you provided support for any valid json, you could cover other situations as well, including those with more complicated and possibly nested configuration. If we would ever support such scenarios in the future, I assume that we would want a generic solution that would cover all configuration. Is this something that we anticipate needing and if so, how would the current proposal cover such situations? On Fri, Jan 26, 2018 at 8:43 AM, Jens Deppe wrote: > This also avoids the other option of implementing this by having associated > 'params' options for each option which can take a Declarable, thus reducing > the proliferation of options - in particular for 'create region'. > > i.e. --cache-listener AND --cache-listener-params. > > Further, this json parameter would not allow for arbitrary json but would > be restricted to a simple key/value mapping so that there would be a direct > translation to Declarable parameters in the cache.xml. > > --Jens > > On Fri, Jan 26, 2018 at 8:07 AM, Jinmei Liao wrote: > > > Currently, when you want to specify a call-back in gfsh command option, > you > > can only pass in the class name, e.g.: > > > > create region --name=regionA --type=PARTITION > --cache-loader=my.CacheLoader > > > > But these callbacks all implement Declarable (CacheLoader, CacheWriter, > > CacheListener, CustomExpiry etc.), i.e. they can be initialized with extra > > properties that can be set when they are declared in cache.xml, but > > currently, our gfsh command doesn't support that. > > > > We are proposing to add support for configuring Declarables in gfsh > > commands by adding json strings at the end of the class name, like this: > > > > create region --name=regionA --type=PARTITION > > --cache-loader=my.CacheLoader?{"key":"value","key2":"value2"} > > > > (of course, if you don't need to configure your Declarable, you can still > > specify only a className as before). 
> > > > Comments/thoughts? > > > > -- > > Cheers > > > > Jinmei > > >
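The proposed className?{...} syntax could be split apart roughly as sketched below. This is a naive illustration, not the actual gfsh implementation: a real version would use a proper JSON parser, whereas this only handles the flat {"key":"value"} mapping that Jens describes:

```java
import java.util.Properties;

public class DeclarableOptionParser {
    // Splits "my.CacheLoader?{...}" into the class name before the '?'.
    static String className(String option) {
        int q = option.indexOf('?');
        return q < 0 ? option : option.substring(0, q);
    }

    // Turns the flat {"key":"value",...} suffix into Properties suitable
    // for passing to a Declarable's init(Properties). Naive parsing only.
    static Properties params(String option) {
        Properties props = new Properties();
        int q = option.indexOf('?');
        if (q < 0) {
            return props; // no '?': plain class name, no parameters
        }
        String json = option.substring(q + 1).trim();
        json = json.substring(1, json.length() - 1); // strip surrounding { }
        for (String pair : json.split(",")) {
            String[] kv = pair.split(":");
            props.setProperty(stripQuotes(kv[0]), stripQuotes(kv[1]));
        }
        return props;
    }

    private static String stripQuotes(String s) {
        return s.trim().replaceAll("^\"|\"$", "");
    }

    public static void main(String[] args) {
        String opt = "my.CacheLoader?{\"key\":\"value\",\"key2\":\"value2\"}";
        System.out.println(className(opt));                   // my.CacheLoader
        System.out.println(params(opt).getProperty("key2"));  // value2
    }
}
```

Keeping the mapping flat, as proposed, is what makes this direct translation to Declarable parameters possible; nested json would require a richer intermediate representation.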
Re: [PROPOSAL]: deprecating create region using --template-region option
+1 for deprecating --template-region. It feels like a convenience that, by its very nature, has an unobvious result and would require more effort for a user to understand and use correctly than the value it provides. On Wed, Feb 7, 2018 at 10:26 AM, Anilkumar Gingade wrote: > +1 for deprecating --template-region > > User can always find the command used to create the region and re-use it > (or if its in script, cut and paste the line). > > -Anil. > > > On Wed, Feb 7, 2018 at 9:49 AM, Jinmei Liao wrote: > > > Hi, All, > > > > currently, there are two ways to create a region, either to use a > "--type" > > option indicating a region shortcut or a "--template-region" option > > specifying another regionPath where you want to copy the attributes from. > > > > First of all, we are not sure what set of attributes make sense to > > copy to the new region. e.g listeners/loaders/writers, normally these are > > connected to a downstream database that user may/may not want the new > > region to read/write the same table. And the current implementation would > > fail to create a region from a template that has these custom callbacks > > (including listeners, loader, writer, compressor, resolvers etc). So we > > would like to understand how useful this command option is and if we can > > stop supporting it? > > > > Suggestions/feedback welcome! > > > > -- > > Cheers > > > > Jinmei > > >
[PROPOSAL]: concurrent bucket moves during rebalance
Team, The time required to undertake a rebalance of a geode cluster has often been an area for improvement noted by users. Currently, buckets are moved one at a time, and we propose that moving buckets in parallel could greatly improve the performance of this feature. Previously, parallelization was implemented for adding redundant copies of buckets to restore redundancy. However, moving buckets is a more complicated matter and requires a different approach than restoration of redundancy. The reason for this is that members could potentially be both gaining buckets and giving away buckets at the same time. While giving away a bucket, that member still has all of the data for the bucket, until the receiving member has fully received the bucket and it can safely be removed from the original owner. This means that unless the member has the memory overhead to store all of the buckets it will receive and all the buckets it started with, there is potential that parallel moving of buckets could cause the member to run out of memory. For this reason, we propose a system that does (potentially) several rounds of concurrent bucket moves: 1) A set of moves is calculated to improve balance that meets the requirement that no member both receives and gives away a bucket (no member will have the memory overhead of an existing bucket it is ultimately removing and a new bucket). 2) Conduct all calculated bucket moves in parallel. Parameters to throttle this process (to prevent taking too many cluster resources, impacting performance) should be added, such as only allowing each member to either receive or send a maximum number of buckets concurrently. 3) If the cluster is not yet balanced, perform additional iterations of calculating and conducting bucket moves, until balance is achieved or a maximum number of iterations is reached. Note: in both the existing and proposed system, regions are rebalanced one at a time. 
Please let us know if you have feedback on this approach or additional ideas that should be considered.
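The three-step proposal above can be sketched as a planning loop. This is a hedged illustration of the idea only, not Geode's rebalancer: member names, the one-bucket-per-pair move granularity, and the mean-based balance test are all simplifying assumptions. The key property shown is that each round's senders and receivers are disjoint, so no member both gives and receives in the same round:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParallelRebalanceSketch {
    // Runs planning rounds until balanced or maxRounds is hit; returns the
    // number of rounds executed.
    static int rebalance(Map<String, Integer> bucketCounts, int maxRounds) {
        int rounds = 0;
        while (rounds < maxRounds) {
            List<String[]> moves = planRound(bucketCounts);
            if (moves.isEmpty()) {
                break; // balanced: no further improving moves
            }
            // All moves in a round could run in parallel: senders and
            // receivers are disjoint, so no member needs memory for both an
            // incoming bucket and an outgoing one it is ultimately removing.
            for (String[] move : moves) {
                bucketCounts.merge(move[0], -1, Integer::sum); // sender
                bucketCounts.merge(move[1], 1, Integer::sum);  // receiver
            }
            rounds++;
        }
        return rounds;
    }

    // Pairs overloaded members with underloaded ones; each pair is one
    // {sender, receiver} move and no member appears on both sides.
    static List<String[]> planRound(Map<String, Integer> bucketCounts) {
        double mean = bucketCounts.values().stream()
            .mapToInt(Integer::intValue).average().orElse(0);
        List<String> senders = new ArrayList<>();
        List<String> receivers = new ArrayList<>();
        for (Map.Entry<String, Integer> e : bucketCounts.entrySet()) {
            if (e.getValue() > Math.ceil(mean)) {
                senders.add(e.getKey());
            } else if (e.getValue() < Math.floor(mean)) {
                receivers.add(e.getKey());
            }
        }
        List<String[]> moves = new ArrayList<>();
        for (int i = 0; i < Math.min(senders.size(), receivers.size()); i++) {
            moves.add(new String[] {senders.get(i), receivers.get(i)});
        }
        return moves;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("member1", 10);
        counts.put("member2", 2);
        counts.put("member3", 6);
        int rounds = rebalance(counts, 20);
        System.out.println(rounds + " rounds -> " + counts);
    }
}
```

A real planner would also weigh bucket sizes and redundancy zones rather than raw bucket counts.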
Re: [PROPOSAL]: concurrent bucket moves during rebalance
Mike, I think having a good default value for maximum parallel operations will play a role in not consuming too many resources. Perhaps defaulting to only a single (or other small number based on testing) parallel action(s) per member at a time and allowing users that want better performance to increase that number would be a good start. That should result in performance improvements, but not place increased burden on any specific member. Especially when bootstrapping new members, rebalance speed may be more valuable than usual, so making it possible to configure this per rebalance action would be preferred. One clarification from my original proposal: regions can already be rebalanced in parallel, depending on the value of resource.manager.threads (which defaults to 1, so no parallelization of regions in the default case). On Thu, Mar 8, 2018 at 11:46 AM, Michael Stolz wrote: > We should be very careful about how much resource we dedicate to > rebalancing. > > One of our competitors rebalances *much* faster than we do, but in doing so > they consume all available resources. > > At one bank that caused significant loss of incoming market data that was > coming in on a multicast feed, which had a severe adverse effect on the > pricing and risk management functions for a period of time. That bank > removed the competitor's product and for several years no distributed > caching was allowed by the chief architect at that bank. Until he left and > a new chief architect was named they didn't use any distributed caching > products. When they DID go back to using them, it pre-dated Geode, so they > used GemFire largely because GemFire does not consume all available > resources while rebalancing. > > I do think we need to improve our rebalancing such that it iterates until > it achieves balance, but not in a way that will consume all available > resources. 
> > -- > Mike Stolz > > > On Thu, Mar 8, 2018 at 2:25 PM, Nick Reich wrote: > > > Team, > > > > The time required to undertake a rebalance of a geode cluster has often > > been an area for improvement noted by users. Currently, buckets are moved > > one at a time and we propose that creating a system that moved buckets in > > parallel could greatly improve performance for this feature. > > > > Previously, parallelization was implemented for adding redundant copies > of > > buckets to restore redundancy. However, moving buckets is a more > > complicated matter and requires a different approach than restoration of > > redundancy. The reason for this is that members could be potentially both > > be gaining buckets and giving away buckets at the same time. While giving > > away a bucket, that member still has all of the data for the bucket, > until > > the receiving member has fully received the bucket and it can safely be > > removed from the original owner. This means that unless the member has > the > > memory overhead to store all of the buckets it will receive and all the > > buckets it started with, there is potential that parallel moving of > buckets > > could cause the member to run out of memory. > > > > For this reason, we propose a system that does (potentially) several > rounds > > of concurrent bucket moves: > > 1) A set of moves is calculated to improve balance that meet a > requirement > > that no member both receives and gives away a bucket (no member will have > > memory overhead of an existing bucket it is ultimately removing and a new > > bucket). > > 2) Conduct all calculated bucket moves in parallel. Parameters to > throttle > > this process (to prevent taking too many cluster resources, impacting > > performance) should be added, such as only allowing each member to either > > receive or send a maximum number of buckets concurrently. 
> > 3) If cluster is not yet balanced, perform additional iterations of > > calculating and conducting bucket moves, until balance is achieved or a > > possible maximum iterations is reached. > > Note: in both the existing and proposed system, regions are rebalanced > one > > at a time. > > > > Please let us know if you have feedback on this approach or additional > > ideas that should be considered. > > >
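The per-member throttle discussed above (a small default for concurrent transfers, user-tunable upward) could be built around a semaphore. This is an assumption for illustration, not Geode code; the class, method names, and default are invented:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BucketMoveThrottle {
    // Caps how many bucket transfers this member participates in at once.
    // A default of 1 would preserve today's effectively serial behavior,
    // and users wanting faster rebalance could raise the limit.
    private final Semaphore permits;

    BucketMoveThrottle(int maxConcurrentTransfers) {
        this.permits = new Semaphore(maxConcurrentTransfers);
    }

    <T> T runTransfer(Callable<T> transfer) throws Exception {
        permits.acquire(); // blocks while the member is at its transfer limit
        try {
            return transfer.call();
        } finally {
            permits.release();
        }
    }

    // Demonstration: 8 simulated transfers, at most 2 in flight at a time.
    static int maxObservedConcurrency() throws Exception {
        BucketMoveThrottle throttle = new BucketMoveThrottle(2);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> throttle.runTransfer(() -> {
                maxSeen.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                Thread.sleep(20); // stand-in for copying a bucket's data
                inFlight.decrementAndGet();
                return null;
            }));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return maxSeen.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("max concurrent transfers: " + maxObservedConcurrency());
    }
}
```

A fixed permit count is the simple "parameter" approach; a back-pressure variant, as Swapnil suggests later in the thread, would instead grow or shrink the effective limit based on observed load.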
Re: [PROPOSAL]: concurrent bucket moves during rebalance
Swapnil, Interesting points, here are my thoughts: > Given that there is already support for parallel rebalancing among regions, > I do not see the value in supporting parallel rebalancing of buckets. I think there are some advantages to parallel rebalance of a region over rebalancing regions in parallel. The first is that the existing system does not help users with a single region that is significantly larger than others. The second is that the existing system could cause memory overage if several regions are rebalanced in parallel, since they are not coordinated and would suffer from the issue discussed in the proposal about members receiving and giving up buckets at the same time. The third is that specifying the number of regions to rebalance does not spread the load evenly across the cluster, but can hotspot specific members (for the reason discussed in the previous point). By specifying the maximum number of bucket transfers a member should concurrently be involved with, you can specifically tune the overhead that rebalance can cause while at the same time maximizing the number of transfers occurring at a time, especially on larger clusters. If we end up doing this anyway, I would suggest to not rely on "parameters" > for throttling, as these parameters would have to be configured in advance > without knowing what the actual load looks like when rebalance is in > progress and hence could be difficult to get right. It would be ideal if we > can handle this using back pressure. > Any back pressure mechanism still requires parameterization on what the maximum resources are. For example, back pressure on adding tasks to a thread pool executor is based on the number of allowed threads and the maximum backlog. Based on Mike's anecdote, users will likely have their own definition of too many resources, such as the hit they take to either throughput or latency during the rebalance. 
I think the right approach is to provide a good (but conservative) default and let users increase the resource usage until they reach their optimal situation. This means we should allow the level of parallelization to be configurable at runtime and likely also on an individual rebalance (when kicked off manually). I am interested in ideas on how best to parameterize for throttling and making it as easy as possible for users to understand, configure, and predict performance tradeoffs. On Fri, Mar 9, 2018 at 6:52 AM, Anthony Baker wrote: > It would be interesting to have a way to model the behavior of the current > algorithm and compare it to the proposal under various conditions like > membership changes, data imbalance, etc. That would let us understand the > optimality of the change in a concrete way. > > Anthony > > > On Mar 9, 2018, at 1:19 AM, Swapnil Bawaskar > wrote: > > > > Given that there is already support for parallel rebalancing among > regions, > > I do not see the value in supporting parallel rebalancing of buckets. > > > > If we end up doing this anyway, I would suggest to not rely on > "parameters" > > for throttling, as these parameters would have to be configured in > advance > > without knowing what the actual load looks like when rebalance is in > > progress and hence could be difficult to get right. It would be ideal if > we > > can handle this using back pressure. > > > >> On Thu, Mar 8, 2018 at 12:05 PM Nick Reich wrote: > >> > >> Mike, > >> > >> I think having a good default value for maximum parallel operations will > >> play a role in not consuming too many resources. Perhaps defaulting to > only > >> a single (or other small number based on testing) parallel action(s) per > >> member at a time and allowing users that want better performance to > >> increase that number would be a good start. That should result in > >> performance improvements, but not place increased burden on any specific > >> member. 
Especially when bootstrapping new members, relance speed may be > >> more valuable than usual, so making it possible to configure on a per > >> rebalance action level would be prefered. > >> > >> One clarification from my original proposal: regions can already be > >> rebalanced in parallel, depending on the value of > resource.manager.threads > >> (which defaults to 1, so no parallelization or regions in the default > >> case). > >> > >>> On Thu, Mar 8, 2018 at 11:46 AM, Michael Stolz > wrote: > >>> > >>> We should be very careful about how much resource we dedicate to > >>> rebalancing. > >>> > >>> One of our competitors rebalances *much* faster than we do, but in > doing > >> so > &
Re: [DISCUSS] New List for Commit and CI Emails
+1 I use an email filter (from:apachegeod...@gmail.com) to move all the automated CI messages to a different section of my inbox, but having a fully distinct email list, if possible, would be even nicer. On Tue, Mar 20, 2018 at 2:27 PM, Galen O'Sullivan wrote: > Hi all, > > I'm wondering if it would be possible to have a separate list for commit > announcements, CI emails and similar automated messages, and keep this > mailing list for discussion. It would make it easier to filter email, and I > think it might help improve the discussion. I know I've missed messages > sent to this list in the flood of email. What do you think? Who's in charge > of lists and gets to make those decisions? > > Thanks, > Galen >
[DISCUSS]: Tests requiring external services
Team, As part of validating the new JDBC connector for Geode, we have a need for tests that involve connecting to specific databases (like MySQL and Postgres) to validate proper function with those databases. Since these tests require connecting to outside services, unlike existing Geode tests, we are seeking suggestions on how to best incorporate such tests into Geode. The approach we have taken so far is to utilize Docker (and Docker Compose) to take care of spinning up our external services for the duration of the tests. This, however, requires that Docker and Docker Compose be installed on any machine that the tests are run on. Additionally, the Concourse pipeline for validating develop is incompatible with use of Docker for distributed tests, due to the way they are already being run within Docker containers of their own (it seems possible to make it work, but would add overhead to all tests and would be a challenge to implement). To address these issues, we are considering having these tests run under a new task, such as "serviceTest" (instead of IntegrationTest or distributedTest). That way, developers could run all other tests normally on their machines, only requiring Docker and Docker Compose if they wish to run these specific tests. This would also allow them to be their own task in Concourse, eliminating the issues that plague integrating these tests there. Are there other ideas on how to manage these tests or concerns with the proposed approach?
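For readers unfamiliar with the Docker Compose approach, the setup might look something like the following sketch (the service names, images, passwords, and ports here are illustrative assumptions, not the actual Geode test configuration):

```yaml
# Hypothetical docker-compose.yml spinning up the external databases
# for the duration of the JDBC connector tests.
version: "3"
services:
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: secret
      MYSQL_DATABASE: geode_test
    ports:
      - "3306:3306"
  postgres:
    image: postgres:10
    environment:
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: geode_test
    ports:
      - "5432:5432"
```

The tests would then run `docker-compose up -d` before executing and `docker-compose down` afterward, which is exactly the host-level dependency on Docker and Docker Compose described above.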
Re: [DISCUSS]: Tests requiring external services
Using the AcceptanceTest category seems like a good solution at the moment to me. On Tue, Apr 3, 2018 at 4:29 PM, Sean Goller wrote: > I'm actually fine with putting it in AcceptanceTest for now. > > Ideally I'd like to see something like JDBC connection strings that could > be passed in as properties via the command-line, and if they're not present > the relevant tests don't get run. That way the entity running the tests can > decide the best way to enable those tests. > > On Tue, Apr 3, 2018 at 4:11 PM, Jens Deppe wrote: > > > I'm in favor of using docker for test isolation. We already have an > > 'AcceptanceTest' category which you might consider using instead of > > creating yet another category. > > > > --Jens > > > > On Tue, Apr 3, 2018 at 2:02 PM, Nick Reich wrote: > > > > > Team, > > > > > > As part of validating the new JDBC connector for Geode, we have a need > > for > > > tests that involve connecting to specific databases (like MySQL and > > > Postgres) to validate proper function with those databases. Since these > > > tests require connecting to outside services, unlike existing Geode > > tests, > > > we are seeking suggestions on how to best incorporate such tests into > > > Geode. The approach we have taken so far is to utilize Docker (and > Docker > > > Compose) to take care of spinning up our external services for the > > duration > > > of the tests. This, however, requires that Docker and Docker Compose be > > > installed on any machine that the tests are run on. Additionally, the > > > Concourse pipeline for validating develop is incompatible with use of > > > Docker for distributed tests, due to the way they are already being run > > > within Docker containers of their own (it seems possible to make it > work, > > > but would add overhead to all tests and would be a challenge to > > implement). 
> > > > > > To address these issues, we are considering having these tests run > under > > a > > > new task, such as "serviceTest" (instead of IntegrationTest or > > > distributedTest). That way, developers could run all other tests > normally > > > on their machines, only requiring Docker and Docker Compose if they > wish > > to > > > run these specific tests. This would also allow them to be their own > task > > > in Concourse, eliminating the issues that plague integrating these > tests > > > there. > > > > > > Are there other ideas on how to manage these tests or concerns with the > > > proposed approach? > > > > > >
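Sean's suggestion of gating these tests on a command-line property could be sketched as follows (the property name and class are hypothetical, chosen only to illustrate the pattern; in a real JUnit test the same check would typically feed an `Assume.assumeTrue(...)` so the test reports as skipped):

```java
// Hypothetical sketch: an external-service test runs only when the entity
// running the tests supplies a JDBC connection string as a system property,
// e.g. -Dgeode.test.jdbc.url=jdbc:mysql://localhost:3306/geode_test
public class JdbcServiceTestGate {
  // Assumed property name, not an actual Geode convention.
  static final String URL_PROP = "geode.test.jdbc.url";

  // True only when a connection string was provided on the command line.
  public static boolean shouldRun() {
    String url = System.getProperty(URL_PROP);
    return url != null && !url.isEmpty();
  }

  public static String describe() {
    return shouldRun()
        ? "running against " + System.getProperty(URL_PROP)
        : "skipped: no " + URL_PROP + " provided";
  }
}
```

This keeps the decision of how (and whether) to provide the external services with whoever runs the tests, which composes well with either a `serviceTest` task or the existing AcceptanceTest category.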