[
https://issues.apache.org/jira/browse/GEODE-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133470#comment-16133470
]
ASF GitHub Bot commented on GEODE-3448:
---------------------------------------
GitHub user nreich opened a pull request:
https://github.com/apache/geode/pull/721
GEODE-3448: Implement and expose parallel snapshot import
Tests showed that parallel import of snapshot files (produced by a parallel
export) scales nearly linearly with cluster size, greatly improving import
performance on larger clusters. Using it requires specifying a directory,
whose path must be the same on every node that has files to import. All
snapshot .gfd files in those directories will be loaded into the cluster.
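The per-member behaviour described above can be sketched roughly as follows. This is not Geode code: the function names, the per-member root directories, and the simulated "load" step are all assumptions made for illustration. Each member looks in the same relative directory on its own disk and loads every .gfd file it finds there, and members work concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def find_snapshot_files(directory):
    """Return the .gfd files directly inside `directory`, sorted by name."""
    return sorted(p for p in Path(directory).iterdir()
                  if p.is_file() and p.suffix == ".gfd")


def import_on_member(member_root, snapshot_dir):
    """Hypothetical per-member step: 'load' every local .gfd file.

    Loading is only simulated here by returning the file names; a real
    member would read the entries and put them into the region.
    """
    local_dir = Path(member_root) / snapshot_dir
    if not local_dir.is_dir():
        return []  # this member has nothing to import
    return [p.name for p in find_snapshot_files(local_dir)]


def parallel_import(member_roots, snapshot_dir):
    """Run the per-member step concurrently, one task per member."""
    with ThreadPoolExecutor(max_workers=max(1, len(member_roots))) as pool:
        results = pool.map(
            lambda root: import_on_member(root, snapshot_dir), member_roots)
        return dict(zip(member_roots, results))
```

Calling `parallel_import(roots, "snapshots")` returns a mapping from each member root to the snapshot files that member loaded. Non-.gfd files are ignored, and a member without the directory contributes an empty list.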
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nreich/geode feature/GEODE-3448
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/geode/pull/721.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #721
----
commit 85d7b54627b59e75c7c08bce1d6feef7e565ea10
Author: Nick Reich <[email protected]>
Date: 2017-08-17T23:29:45Z
GEODE-3448: Implement and expose parallel snapshot import
----
> Complete and expose parallel snapshot importing feature
> -------------------------------------------------------
>
> Key: GEODE-3448
> URL: https://issues.apache.org/jira/browse/GEODE-3448
> Project: Geode
> Issue Type: Sub-task
> Components: snapshot
> Reporter: Nick Reich
> Assignee: Nick Reich
>
> Tests have suggested that parallelizing the import of a snapshot (using
> the multiple files created by a parallel export) has significant advantages
> as the cluster increases in size. The ability to do this conveniently
> (instead of running an import manually on each member of the cluster) should
> be exposed.
> The implementation will assume that each member of the cluster has the
> specified directory locally available (or at least every member where there
> is data that needs import). However, this means that it will not be possible
> to do a parallel import directly from a shared network disk. If we require
> such functionality, a flag will need to be added to indicate the use of a
> shared location; otherwise, all data will be imported by each member, rather
> than just once.
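The shared-disk caveat above can be made concrete with a small sketch. Nothing here is Geode API: the function names are hypothetical, and the round-robin assignment is only one possible behaviour for the proposed flag, which the issue does not specify. The first function counts how often each file is imported when every member loads every file it can see; the second shows a way each file could be given to exactly one member.

```python
from collections import Counter


def import_counts(files_by_member):
    """Count how many times each file gets imported when every member
    loads every .gfd file visible to it."""
    counts = Counter()
    for files in files_by_member.values():
        counts.update(files)
    return counts


def assign_shared_files(members, files):
    """Hypothetical shared-location mode: assign each file to exactly one
    member, round-robin over the sorted member and file names."""
    ordered = sorted(members)
    plan = {member: [] for member in ordered}
    for index, name in enumerate(sorted(files)):
        plan[ordered[index % len(ordered)]].append(name)
    return plan
```

With local directories, `import_counts({"m1": ["a.gfd"], "m2": ["b.gfd"]})` reports each file once. If both members instead see the same three files on a shared disk, every count becomes 2, which is the duplicate import the issue warns about. Feeding the result of `assign_shared_files` back into `import_counts` brings every count back to 1.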