Github user dschneider-pivotal commented on a diff in the pull request:

    https://github.com/apache/geode/pull/559#discussion_r120224399
  
    --- Diff: geode-docs/managing/troubleshooting/system_failure_and_recovery.html.md.erb ---
    @@ -276,8 +276,83 @@ find the reason.
     
     Description:
     
    -The process discovered that it was not in the distributed system and cannot determine why it was removed. The membership coordinator removed the member after it failed to respond to an internal are you alive message.
    +The process discovered that it was not in the distributed system and cannot determine why it was
    +removed. The membership coordinator removed the member after it failed to respond to an internal
    +are-you-alive message.
     
     Response:
     
     The operator should examine the locator processes and logs.
    +
    +## <a id="restart-failure-persistent-lru" class="no-quick-link"></a> Restart Fails Due To Out-of-Memory Error
    +
    +This section describes a restart failure that can occur when the stopped system is one that was configured with persistent regions. Specifically:
    +
    +- Some of the regions of the recovering system, when running, were configured as PERSISTENT regions, which means that they save their data to disk.
    +- At least one of the persistent regions was configured to evict least recently used (LRU) data by overflowing values to disk.
    +
    +### How Data is Recovered From Persistent Regions
    +
    +Data recovery, upon restart, always recovers keys. You can configure whether and how the system
    +recovers the values associated with those keys to populate the system cache.
    +
    +**Value Recovery**
    +
    +- Recovering all values immediately during startup slows the startup time but results in consistent
    +read performance after the startup on a "hot" cache.
    +
    +- Recovering no values means quicker startup but a "cold" cache, so the first retrieval of each value will read from disk.
    +
    +- Retrieving values asynchronously in a background thread allows a relatively quick startup on a "warm" cache
    +that will eventually recover every value.
    +
    +**Retrieve or Ignore LRU values**
    +
    +When a system with persistent LRU regions shuts down, the system does not record which of the values
    +were recently used. On subsequent startup, if values are recovered into an LRU region they may be
    +the least recently used instead of the most recently used. Also, if LRU values are recovered on a
    +heap or an off-heap LRU region, it is possible that the LRU memory limit will be exceeded, resulting
    +in an `OutOfMemoryException` during recovery. For these reasons, LRU value recovery can be treated
    +differently than non-LRU values.
    +
    +## Default Recovery Behavior for Persistent Regions
    +
    +The default behavior is for the system to recover all keys, then asynchronously recover all data
    +values that were resident, leaving LRU values unrecovered. This default strategy is best for
    +most applications, because it strikes a balance between recovery speed and cache completeness.
    +
    +### Configuring Recovery of Persistent Regions
    +
    +Three Java system parameters allow the developer to control the recovery behavior for persistent regions:
    +
    +- `gemfire.disk.recoverValues`
    +
    +  Default = `true`, recover values. If `false`, recover only keys, do not recover values.
    +
    +  *How used:* When `true`, recovery of the values "warms up" the cache so data retrievals will find
    +  their values in the cache, without causing time consuming disk accesses. When `false`, shortens
    +  recovery time so the system becomes available for use sooner, but the first retrieval on each key
    +  will require a disk read.
    +
    +- `gemfire.disk.recoverLruValues`
    +
    +  Default = `false`, do not recover LRU values. If `true`, recover LRU values. If
    +  `gemfire.disk.recoverValues` is `false`, then `gemfire.disk.recoverLruValues` is ignored, since
    +  no values are recovered.
    +
    +  *How used:* When `false`, shortens recovery time by ignoring LRU values. When `true`, restores
    +  more data values to the cache. Recovery of the LRU values increases heap memory usage and
    +  could cause an out-of-memory error, preventing the system from restarting.
    +
    +- `gemfire.disk.recoverValuesSync`
    +
    +  Default = `false`, recover values by an asynchronous background process. If `true`, values are
    +  recovered synchronously, and recovery is not complete until all values have been retrieved.  If
    +  `gemfire.disk.recoverValues` is `false`, then `gemfire.disk.recoverValuesSync` is ignored since
    +  no values are recovered.
    +
    +  *How used:* When `false`, allows the system to become available sooner, but some time must elapse
    +  before the entire cache is refreshed. Some key retrievals will require disk access, and some will not.
    --- End diff --
    
    Change "the entire cache is refreshed" to "all values have been read from disk into cache memory".
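
For readers of this hunk, the way the three properties combine can be sketched roughly as below. This is only a model of the behavior the doc text describes (defaults and the "ignored when `gemfire.disk.recoverValues` is `false`" rule); `RecoveryPlan` is a hypothetical helper made up for illustration and is not part of Geode.

```java
import java.util.Properties;

/** Hypothetical helper (not a Geode class): models how the three flags combine. */
final class RecoveryPlan {
    final boolean recoverValues;
    final boolean recoverLruValues;
    final boolean synchronous;

    RecoveryPlan(Properties props) {
        // Defaults are the ones stated in the doc text under review.
        recoverValues = flag(props, "gemfire.disk.recoverValues", true);
        // Per the doc text, the other two flags are ignored when no values are recovered.
        recoverLruValues = recoverValues && flag(props, "gemfire.disk.recoverLruValues", false);
        synchronous = recoverValues && flag(props, "gemfire.disk.recoverValuesSync", false);
    }

    // Reads a boolean property, falling back to the given default when it is unset.
    private static boolean flag(Properties props, String name, boolean defaultValue) {
        String raw = props.getProperty(name);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    /** Short human-readable summary of what would be recovered, and how. */
    String describe() {
        if (!recoverValues) {
            return "keys only";
        }
        return "keys, then "
            + (recoverLruValues ? "all values" : "non-LRU values")
            + (synchronous ? " synchronously" : " asynchronously");
    }
}
```

Calling `new RecoveryPlan(System.getProperties()).describe()` in a JVM started without any of these flags gives the default described in the hunk: keys first, then non-LRU values in the background. Since these are Java system properties, they would normally be set with the standard `-D` JVM option, for example `-Dgemfire.disk.recoverLruValues=true`.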


