Merge branch '1.6' into 1.7 Conflicts: docs/src/main/latex/accumulo_user_manual/chapters/administration.tex
Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/14e88e4b Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/14e88e4b Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/14e88e4b Branch: refs/heads/1.7 Commit: 14e88e4ba6fef601bca2fa1307b654173315f356 Parents: 5efa0bd 97a92a0 Author: Josh Elser <els...@apache.org> Authored: Mon Dec 28 13:37:16 2015 -0500 Committer: Josh Elser <els...@apache.org> Committed: Mon Dec 28 13:37:16 2015 -0500 ---------------------------------------------------------------------- .../main/asciidoc/chapters/administration.txt | 40 ++++++++++++++++++++ 1 file changed, 40 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/accumulo/blob/14e88e4b/docs/src/main/asciidoc/chapters/administration.txt ---------------------------------------------------------------------- diff --cc docs/src/main/asciidoc/chapters/administration.txt index 0a29711,0000000..919ec8f mode 100644,000000..100644 --- a/docs/src/main/asciidoc/chapters/administration.txt +++ b/docs/src/main/asciidoc/chapters/administration.txt @@@ -1,1106 -1,0 +1,1146 @@@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +== Administration + +=== Hardware + +Because we are running essentially two or three systems simultaneously layered +across the cluster: HDFS, Accumulo and MapReduce, it is typical for hardware to +consist of 4 to 8 cores, and 8 to 32 GB RAM. This is so each running process can have +at least one core and 2 - 4 GB each. + +One core running HDFS can typically keep 2 to 4 disks busy, so each machine may +typically have as little as 2 x 300GB disks and as much as 4 x 1TB or 2TB disks. + +It is possible to do with less than this, such as with 1u servers with 2 cores and 4GB +each, but in this case it is recommended to only run up to two processes per +machine -- i.e. DataNode and TabletServer or DataNode and MapReduce worker but +not all three. The constraint here is having enough available heap space for all the +processes on a machine. + +=== Network + +Accumulo communicates via remote procedure calls over TCP/IP for both passing +data and control messages. In addition, Accumulo uses HDFS clients to +communicate with HDFS. To achieve good ingest and query performance, sufficient +network bandwidth must be available between any two machines. + +In addition to needing access to ports associated with HDFS and ZooKeeper, Accumulo will +use the following default ports. Please make sure that they are open, or change +their value in conf/accumulo-site.xml. 
+ +.Accumulo default ports +[width="75%",cols=">,^2,^2"] +[options="header"] +|==== +|Port | Description | Property Name +|4445 | Shutdown Port (Accumulo MiniCluster) | n/a +|4560 | Accumulo monitor (for centralized log display) | monitor.port.log4j +|9997 | Tablet Server | tserver.port.client +|9999 | Master Server | master.port.client +|12234 | Accumulo Tracer | trace.port.client +|42424 | Accumulo Proxy Server | n/a +|50091 | Accumulo GC | gc.port.client +|50095 | Accumulo HTTP monitor | monitor.port.client +|10001 | Master Replication service | master.replication.coordinator.port +|10002 | TabletServer Replication service | replication.receipt.service.port +|==== + +In addition, the user can provide +0+ and an ephemeral port will be chosen instead. This +ephemeral port is likely to be unique and not already bound. Thus, configuring ports to +use +0+ instead of an explicit value, should, in most cases, work around any issues of +running multiple distinct Accumulo instances (or any other process which tries to use the +same default ports) on the same hardware. + +=== Installation +Choose a directory for the Accumulo installation. This directory will be referenced +by the environment variable +$ACCUMULO_HOME+. Run the following: + + $ tar xzf accumulo-1.6.0-bin.tar.gz # unpack to subdirectory + $ mv accumulo-1.6.0 $ACCUMULO_HOME # move to desired location + +Repeat this step at each machine within the cluster. Usually all machines have the +same +$ACCUMULO_HOME+. + +=== Dependencies +Accumulo requires HDFS and ZooKeeper to be configured and running +before starting. Password-less SSH should be configured between at least the +Accumulo master and TabletServer machines. It is also a good idea to run Network +Time Protocol (NTP) within the cluster to ensure nodes' clocks don't get too out of +sync, which can cause problems with automatically timestamped data. + +=== Configuration + +Accumulo is configured by editing several Shell and XML files found in ++$ACCUMULO_HOME/conf+. The structure closely resembles Hadoop's configuration +files. + +Logging is primarily controlled using the log4j configuration files, ++generic_logger.xml+ and +monitor_logger.xml+ (or their corresponding ++.properties+ version if the +.xml+ version is missing). The generic logger is +used for most server types, and is typically configured to send logs to the +monitor, as well as log files. The monitor logger is used by the monitor, and +is typically configured to log only errors the monitor itself generates, +rather than all the logs that it receives from other server types. + +==== Edit conf/accumulo-env.sh + +Accumulo needs to know where to find the software it depends on. Edit accumulo-env.sh +and specify the following: + +. Enter the location of the installation directory of Accumulo for +$ACCUMULO_HOME+ +. Enter your system's Java home for +$JAVA_HOME+ +. Enter the location of Hadoop for +$HADOOP_PREFIX+ +. Choose a location for Accumulo logs and enter it for +$ACCUMULO_LOG_DIR+ +. Enter the location of ZooKeeper for +$ZOOKEEPER_HOME+ + +By default Accumulo TabletServers are set to use 1GB of memory. You may change +this by altering the value of +$ACCUMULO_TSERVER_OPTS+. Note the syntax is that of +the Java JVM command line options. This value should be less than the physical +memory of the machines running TabletServers. + +There are similar options for the master's memory usage and the garbage collector +process. 
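For example, the relevant lines in +accumulo-env.sh+ might look like the following sketch. The variable names are those used in the example +accumulo-env.sh+ files shipped with Accumulo; the heap sizes shown are purely illustrative and should be tuned for your hardware:

....
# Java heap settings for the TabletServer, Master and GC processes (illustrative values)
export ACCUMULO_TSERVER_OPTS="-Xmx1g -Xms1g"
export ACCUMULO_MASTER_OPTS="-Xmx1g -Xms1g"
export ACCUMULO_GC_OPTS="-Xmx256m -Xms256m"
....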
Reduce these values if they exceed the physical RAM of your hardware and +increase them, within the bounds of the physical RAM, if a process fails because of +insufficient memory. + +Note that you will be specifying the Java heap space in accumulo-env.sh. You should +make sure that the total heap space used for the Accumulo tserver and the Hadoop +DataNode and TaskTracker is less than the available memory on each slave node in +the cluster. On large clusters, it is recommended that the Accumulo master, Hadoop +NameNode, secondary NameNode, and Hadoop JobTracker all be run on separate +machines to allow them to use more heap space. If you are running these on the +same machine on a small cluster, likewise make sure their heap space settings fit +within the available memory. + +==== Native Map + +The tablet server uses a data structure called a MemTable to store sorted key/value +pairs in memory when they are first received from the client. When a minor compaction +occurs, this data structure is written to HDFS. The MemTable will default to using +memory in the JVM but a JNI version, called the native map, can be used to significantly +speed up performance by utilizing the memory space of the native operating system. The +native map also reduces the performance impact of JVM garbage collection, since keeping this data off the JVM heap causes garbage-collection pauses to occur much less frequently. + +===== Building + +32-bit and 64-bit Linux and Mac OS X versions of the native map can be built +from the Accumulo bin package by executing ++$ACCUMULO_HOME/bin/build_native_library.sh+. If your system's +default compiler options are insufficient, you can add additional compiler +options to the command line, such as options for the architecture. These will be +passed to the Makefile in the environment variable +USERFLAGS+. + +Examples: + +. +$ACCUMULO_HOME/bin/build_native_library.sh+ +. +$ACCUMULO_HOME/bin/build_native_library.sh -m32+ + +After building the native map from source, you will find the artifact in ++$ACCUMULO_HOME/lib/native+. Upon starting up, the tablet server will look +in this directory for the map library. If the file is renamed or moved from its +target directory, the tablet server may not be able to find it. The system can +also locate the native maps shared library by setting +LD_LIBRARY_PATH+ +(or +DYLD_LIBRARY_PATH+ on Mac OS X) in +$ACCUMULO_HOME/conf/accumulo-env.sh+. + +===== Native Maps Configuration + +As mentioned, Accumulo will use the native libraries if they are found in the expected +location and if it is not configured to ignore them. Using the native maps over JVM +Maps nets a noticeable improvement in ingest rates; however, certain configuration +variables are important to modify when increasing the size of the native map. + +To adjust the size of the native map, increase the value of +tserver.memory.maps.max+. +By default, the maximum size of the native map is 1GB. When increasing this value, it is +also important to adjust the values of +table.compaction.minor.logs.threshold+ and ++tserver.walog.max.size+. +table.compaction.minor.logs.threshold+ is the maximum +number of write-ahead log files that a tablet can reference before it will be automatically +minor compacted. +tserver.walog.max.size+ is the maximum size of a write-ahead log.
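As an illustration only (the values here are not recommendations), a 2GB native map could be configured in +accumulo-site.xml+ along the following lines, with the write-ahead log settings sized to satisfy the relationship described in the next paragraph:

[source,xml]
<property>
  <name>tserver.memory.maps.max</name>
  <value>2G</value>
</property>
<property>
  <name>tserver.walog.max.size</name>
  <value>1G</value>
</property>
<property>
  <name>table.compaction.minor.logs.threshold</name>
  <value>3</value>
</property>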
+ +The maximum size of the native maps for a server should be less than the product +of the write-ahead log maximum size and minor compaction threshold for log files: + ++$table.compaction.minor.logs.threshold * $tserver.walog.max.size >= $tserver.memory.maps.max+ + +This formula ensures that minor compactions won't be automatically triggered before the native +maps can be completely saturated. + +Additionally, when increasing the size of the write-ahead logs, it can also be important +to increase the HDFS block size that Accumulo uses when creating the files for the write-ahead log. +This is controlled via +tserver.wal.blocksize+. A basic recommendation is that when ++tserver.walog.max.size+ is larger than 2GB, set +tserver.wal.blocksize+ to 2GB. +Increasing the block size to a value larger than 2GB can result in decreased write +performance to the write-ahead log file, which will slow ingest. + +==== Cluster Specification + +On the machine that will serve as the Accumulo master: + +. Write the IP address or domain name of the Accumulo Master to the +$ACCUMULO_HOME/conf/masters+ file. +. Write the IP addresses or domain names of the machines that will be TabletServers in +$ACCUMULO_HOME/conf/slaves+, one per line. + +Note that if using domain names rather than IP addresses, DNS must be configured +properly for all machines participating in the cluster. DNS can be a confusing source +of errors. + +==== Accumulo Settings +Specify appropriate values for the following settings in ++$ACCUMULO_HOME/conf/accumulo-site.xml+: + +[source,xml] +<property> + <name>instance.zookeeper.host</name> + <value>zooserver-one:2181,zooserver-two:2181</value> + <description>list of zookeeper servers</description> +</property> + +This enables Accumulo to find ZooKeeper. Accumulo uses ZooKeeper to coordinate +settings between processes and to help finalize TabletServer failure. + +[source,xml] +<property> + <name>instance.secret</name> + <value>DEFAULT</value> +</property> + +The instance needs a secret to enable secure communication between servers. Configure your +secret and make sure that the +accumulo-site.xml+ file is not readable by other users. +For alternatives to storing the +instance.secret+ in plaintext, please read the ++Sensitive Configuration Values+ section. + +Some settings can be modified via the Accumulo shell and take effect immediately, but +some settings require a process restart to take effect. See the configuration documentation +(available in the docs directory of the tarball and in <<configuration>>) for details. + +One aspect of Accumulo's configuration which differs from the rest of the Hadoop +ecosystem is that the server-process classpath is determined in part by multiple values. A +bootstrap classpath is based solely on the `accumulo-start.jar`, Log4j and `$ACCUMULO_CONF_DIR`. + +A second classloader is used to dynamically load all of the resources specified by `general.classpaths` +in `$ACCUMULO_CONF_DIR/accumulo-site.xml`. This value is a comma-separated list of regular-expression +paths which are all loaded into a secondary classloader. This includes the Hadoop, Accumulo and ZooKeeper +jars necessary to run Accumulo. When this value is not defined, a default value is used which attempts +to load Hadoop from multiple potential locations depending on how Hadoop was installed. It is strongly +recommended that `general.classpaths` is defined and limited to only the necessary jars to prevent +extra jars from being unintentionally loaded into Accumulo processes.
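The snippet below is a trimmed sketch of such a definition. It assumes a standard Apache Hadoop 2 layout under +$HADOOP_PREFIX+; the exact entries you need depend entirely on how Hadoop and ZooKeeper are installed on your systems, so treat the patterns as a starting point rather than a definitive list:

[source,xml]
<property>
  <name>general.classpaths</name>
  <value>
    $ACCUMULO_HOME/lib/[^.].*.jar,
    $ZOOKEEPER_HOME/zookeeper[^.].*.jar,
    $HADOOP_CONF_DIR,
    $HADOOP_PREFIX/share/hadoop/common/[^.].*.jar,
    $HADOOP_PREFIX/share/hadoop/common/lib/[^.].*.jar,
    $HADOOP_PREFIX/share/hadoop/hdfs/[^.].*.jar,
    $HADOOP_PREFIX/share/hadoop/mapreduce/[^.].*.jar,
    $HADOOP_PREFIX/share/hadoop/yarn/[^.].*.jar,
  </value>
</property>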
+ +==== Hostnames in configuration files + +Accumulo has a number of configuration files which can contain references to other hosts in your +network. All of the "host" configuration files for Accumulo (+gc+, +masters+, +slaves+, +monitor+, ++tracers+) as well as +instance.volumes+ in accumulo-site.xml must contain some host reference. + +While IP addresses, short hostnames, or fully qualified domain names (FQDN) are all technically valid, it +is good practice to always use FQDNs for both Accumulo and other processes in your Hadoop cluster. +Failing to consistently use FQDNs can have unexpected consequences in how Accumulo uses the FileSystem. + +A common way this problem is observed is via applications that use Bulk Ingest. The Accumulo +Master coordinates moving the Bulk Ingest input files to an Accumulo-managed directory. However, +Accumulo cannot safely move files across different Hadoop FileSystems. This is problematic because +Accumulo also cannot reliably determine that a FileSystem specified +with different names is in fact the same FileSystem. For example, while +127.0.0.1:8020+ might be a valid identifier for an HDFS instance, +Accumulo identifies +localhost:8020+ as a different HDFS instance than +127.0.0.1:8020+. + +==== Deploy Configuration + +Copy the masters, slaves, accumulo-env.sh, and if necessary, accumulo-site.xml +from the +$ACCUMULO_HOME/conf/+ directory on the master to all the machines +specified in the slaves file. + +==== Sensitive Configuration Values + +Accumulo has a number of properties that can be specified via the accumulo-site.xml +file which are sensitive in nature; instance.secret and trace.token.property.password +are two common examples. Either of these properties, if compromised, can +result in data being leaked to users who should not have access to that data. + +In Hadoop-2.6.0, a new CredentialProvider class was introduced which serves as a common +implementation to abstract away the storage and retrieval of passwords from plaintext +storage in configuration files. Any Property marked with the +Sensitive+ annotation +is a candidate for use with these CredentialProviders. For versions of Hadoop which lack +these classes, the feature will simply be unavailable for use. + +A comma-separated list of CredentialProviders can be configured using the Accumulo Property ++general.security.credential.provider.paths+. Each configured URL will be consulted +when the Configuration object for accumulo-site.xml is accessed. + +==== Using a JavaKeyStoreCredentialProvider for storage + +One of the implementations provided in Hadoop-2.6.0 is a Java KeyStore CredentialProvider. +Each entry in the KeyStore is the Accumulo Property key name. For example, to store the ++instance.secret+, the following command can be used: + + hadoop credential create instance.secret --provider jceks://file/etc/accumulo/conf/accumulo.jceks + +The command will then prompt you to enter the secret to use and create a keystore in: + + /etc/accumulo/conf/accumulo.jceks + +Then, accumulo-site.xml must be configured to use this KeyStore as a CredentialProvider: + +[source,xml] +<property> + <name>general.security.credential.provider.paths</name> + <value>jceks://file/etc/accumulo/conf/accumulo.jceks</value> +</property> + +This configuration will then transparently extract the +instance.secret+ from +the configured KeyStore, avoiding storage of the sensitive +property in a human-readable form.
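To sanity-check what has been stored, the Hadoop credential shell can also list the entries in a keystore. The command below mirrors the +--provider+ form used above; depending on your Hadoop release the flag may be spelled +-provider+ instead, so check the +hadoop credential+ usage output if it is not accepted:

  hadoop credential list --provider jceks://file/etc/accumulo/conf/accumulo.jceks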
+ +A KeyStore can also be stored in HDFS, which will make the KeyStore readily available to +all Accumulo servers. If the local filesystem is used, be aware that each Accumulo server +will expect the KeyStore in the same location. + +[[ClientConfiguration]] +==== Client Configuration + +As of version 1.6.0, Accumulo includes a new type of configuration file known as a client +configuration file. One problem with the traditional "site.xml" file that is prevalent +throughout Hadoop is that it is a single file used by both clients and servers. This makes +it very difficult to protect secrets that are only meant for the server processes while +allowing the clients to connect to the servers. + +The client configuration file is a subset of the information stored in accumulo-site.xml +meant only for consumption by clients of Accumulo. By default, Accumulo checks the following +locations for a client configuration file: + +* +${ACCUMULO_CONF_DIR}/client.conf+ +* +/etc/accumulo/client.conf+ +* +/etc/accumulo/conf/client.conf+ +* +~/.accumulo/config+ + +These files are https://en.wikipedia.org/wiki/.properties[Java Properties files]. These files +can currently contain information about ZooKeeper servers, RPC properties (such as SSL or SASL +connectors), and distributed tracing properties. Valid properties are defined by the +https://github.com/apache/accumulo/blob/f1d0ec93d9f13ff84844b5ac81e4a7b383ced467/core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java#L54[ClientProperty] +enum contained in the client API. + +==== Custom Table Tags + +Accumulo allows users to add custom tags to tables. This allows +applications to set application-level metadata about a table. These tags can be +anything, such as a table description, administrator notes, or the date created. +This is done by setting a property with the prefix +table.custom.*+. + +Currently, table properties are stored in ZooKeeper. This means that the number +and size of custom properties should be restricted: on the order of tens of properties +at most, with no property exceeding 1MB in size. ZooKeeper's performance can be +very sensitive to an excessive number of nodes and the sizes of the nodes. Applications +which leverage the use of custom properties should take these warnings into +consideration. There is no enforcement of these warnings via the API. + +=== Initialization + +Accumulo must be initialized to create the structures it uses internally to locate +data across the cluster. HDFS is required to be configured and running before +Accumulo can be initialized. + +Once HDFS is started, initialization can be performed by executing ++$ACCUMULO_HOME/bin/accumulo init+. This script will prompt for a name +for this instance of Accumulo. The instance name is used to identify a set of tables +and instance-specific settings. The script will then write some information into +HDFS so Accumulo can start properly. + +The initialization script will prompt you to set a root password. Once Accumulo is +initialized it can be started. + +=== Running + +==== Starting Accumulo + +Make sure Hadoop is configured on all of the machines in the cluster, including +access to a shared HDFS instance. Make sure HDFS is running, and make sure +ZooKeeper is configured and running on at least one machine in the +cluster. +Start Accumulo using the +bin/start-all.sh+ script. + +To verify that Accumulo is running, check the Status page as described in +<<monitoring>>.
In addition, the Shell can provide some information about the status of +tables by reading the metadata tables. + +==== Stopping Accumulo + +To shut down cleanly, run +bin/stop-all.sh+ and the master will orchestrate the +shutdown of all the tablet servers. Shutdown waits for all minor compactions to finish, so it may +take some time for certain configurations. + +==== Adding a Node + +Update your +$ACCUMULO_HOME/conf/slaves+ (or +$ACCUMULO_CONF_DIR/slaves+) file to account for the addition. + + $ACCUMULO_HOME/bin/accumulo admin start <host(s)> {<host> ...} + +Alternatively, you can ssh to each of the hosts you want to add and run: + + $ACCUMULO_HOME/bin/start-here.sh + +Make sure the host in question has the new configuration, or else the tablet +server won't start; at a minimum this needs to be on the host(s) being added, +but in practice it's good to ensure consistent configuration across all nodes. + +==== Decommissioning a Node + +If you need to take a node out of operation, you can trigger a graceful shutdown of a tablet +server. Accumulo will automatically rebalance the tablets across the available tablet servers. + + $ACCUMULO_HOME/bin/accumulo admin stop <host(s)> {<host> ...} + +Alternatively, you can ssh to each of the hosts you want to remove and run: + + $ACCUMULO_HOME/bin/stop-here.sh + +Be sure to update your +$ACCUMULO_HOME/conf/slaves+ (or +$ACCUMULO_CONF_DIR/slaves+) file to +account for the removal of these hosts. Bear in mind that the monitor will not re-read the +slaves file automatically, so it will report the decommissioned servers as down; it's +recommended that you restart the monitor so that the node list is up to date. + +==== Restarting processes on a node + +Occasionally, it might be necessary to restart the processes on a specific node. In addition +to the +start-all.sh+ and +stop-all.sh+ scripts, Accumulo contains scripts to start/stop all processes +on a node and start/stop a given process on a node. + ++start-here.sh+ and +stop-here.sh+ will start/stop all Accumulo processes on the current node. The +necessary processes to start/stop are determined via the "hosts" files (e.g. slaves, masters, etc). +These scripts expect no arguments. + ++start-server.sh+ can also be useful in starting a given process on a host. +The first argument to the process is the hostname of the machine. Use the same host that +you specified in the hosts file (if you specified an FQDN in the masters file, use the FQDN, not +the short name). The second argument is the name of the process to start (e.g. master, tserver). + +The steps described to decommission a node can also be used (without removal of the host +from the +$ACCUMULO_HOME/conf/slaves+ file) to gracefully stop a node. This will +ensure that the tabletserver is cleanly stopped and recovery will not need to be performed +when the tablets are re-hosted. + ++==== Running multiple TabletServers on a single node ++ ++With very powerful nodes, it may be beneficial to run more than one TabletServer on a given ++node. This decision should be made carefully and with much deliberation, as Accumulo is designed ++to be able to scale to using tens of GB of RAM and tens of CPU cores. ++ ++To run multiple TabletServers on a single host, it is necessary to create multiple Accumulo configuration ++directories. Ensuring that the properties in these configurations are appropriately set (and remain consistent) is an exercise ++for the user. ++ ++Accumulo TabletServers bind certain ports on the host to accommodate remote procedure calls to/from ++other nodes.
This requires additional configuration values in +accumulo-site.xml+: ++ ++* +tserver.port.client+ ++* +replication.receipt.service.port+ ++ ++Normally, setting a value of +0+ for these configuration properties is sufficient. In some ++environments, the ports used by Accumulo must be well-known for security reasons, requiring a ++separate copy of the configuration files with a static port for each TabletServer instance. ++ ++It is also necessary to update the following exported variables in +accumulo-env.sh+: ++ ++* +ACCUMULO_LOG_DIR+ ++ ++The values for these properties are left up to the user to define; there are no constraints ++other than ensuring that the directory exists and that the user running Accumulo has permission ++to read/write into that directory. ++ ++Accumulo's provided scripts for stopping a cluster operate under the assumption that one process ++is running per host. As such, starting and stopping multiple TabletServers on one host requires ++more effort from the user. It is important to ensure that +ACCUMULO_CONF_DIR+ is correctly ++set for the instance of the TabletServer being started. ++ ++ ACCUMULO_CONF_DIR=$ACCUMULO_HOME/conf $ACCUMULO_HOME/bin/accumulo tserver --address <your_server_ip> & ++ ++The normal +stop-all.sh+ will stop all TabletServer instances across all nodes. ++Using the +kill+ command provided by your operating system is an option to stop a single instance on ++a single node. +stop-server.sh+ can be used to stop all TabletServers on a single node. ++ ++ +[[monitoring]] +=== Monitoring + +==== Accumulo Monitor +The Accumulo Monitor provides an interface for monitoring the status and health of +Accumulo components. It provides a web UI for accessing this information at ++http://_monitorhost_:50095/+. + +Things highlighted in yellow may be in need of attention. +If anything is highlighted in red on the monitor page, it is something that definitely needs attention. + +The Overview page contains some summary information about the Accumulo instance, including the version, instance name, and instance ID. +There is a table labeled Accumulo Master with current status, a table listing the active Zookeeper servers, and graphs displaying various metrics over time. +These include ingest and scan performance and other useful measurements. + +The Master Server, Tablet Servers, and Tables pages display metrics grouped in different ways (e.g. by tablet server or by table). +Metrics typically include the number of entries (key/value pairs) and ingest and query rates. +The numbers of running scans and major and minor compactions are shown in the form _number_running_ (_number_queued_). +Another important metric is hold time, which is the amount of time a tablet has been waiting but unable to flush its memory in a minor compaction. + +The Server Activity page graphically displays tablet server status, with each server represented as a circle or square. +Different metrics may be assigned to the nodes' color and speed of oscillation. +The Overall Avg metric is only used on the Server Activity page, and represents the average of all the other metrics (after normalization). +Similarly, the Overall Max metric picks the metric with the maximum normalized value. + +The Garbage Collector page displays a list of garbage collection cycles, the number of files found of each type (including deletion candidates in use and files actually deleted), and the length of the deletion cycle.
+The Traces page displays data for recent traces performed (see the following section for information on <<tracing>>). +The Recent Logs page displays warning and error logs forwarded to the monitor from all Accumulo processes. +Also, the XML and JSON links provide metrics in XML and JSON formats, respectively. + +==== SSL +SSL may be enabled for the monitor page by setting the following properties in the +accumulo-site.xml+ file: + + monitor.ssl.keyStore + monitor.ssl.keyStorePassword + monitor.ssl.trustStore + monitor.ssl.trustStorePassword + +If the Accumulo conf directory has been configured (in particular the +accumulo-env.sh+ file must be set up), the +generate_monitor_certificate.sh+ script in the Accumulo +bin+ directory can be used to create the keystore and truststore files with random passwords. +The script will print out the properties that need to be added to the +accumulo-site.xml+ file. +The stores can also be generated manually with the Java +keytool+ command, whose usage can be seen in the +generate_monitor_certificate.sh+ script. + +If desired, the SSL ciphers allowed for connections can be controlled via the following properties in +accumulo-site.xml+: + + monitor.ssl.include.ciphers + monitor.ssl.exclude.ciphers + +If SSL is enabled, the monitor URL can only be accessed via https. +This also allows you to access the Accumulo shell through the monitor page. +The left navigation bar will have a new link to Shell. +An Accumulo user name and password must be entered for access to the shell. + +=== Metrics +Accumulo is capable of using the Hadoop Metrics2 library and is configured by default to use it. Metrics2 is a library +which allows for routing of metrics generated by registered MetricsSources to configured MetricsSinks. Examples of sinks +that are implemented by Hadoop include file-based logging, Graphite and Ganglia. All metric sources are exposed via JMX +when using Metrics2. + +Previous to Accumulo 1.7.0, JMX endpoints could be exposed in addition to file-based logging of those metrics configured via +the +accumulo-metrics.xml+ file. This mechanism can still be used by setting +general.legacy.metrics+ to +true+ in +accumulo-site.xml+. + +==== Metrics2 Configuration + +Metrics2 is configured by examining the classpath for a file that matches +hadoop-metrics2*.properties+. The example configuration +files that Accumulo provides for use include +hadoop-metrics2-accumulo.properties+ as a template which can be used to enable +file, Graphite or Ganglia sinks (some minimal configuration required for Graphite and Ganglia). Because the Hadoop configuration is +also on the Accumulo classpath, be sure that you do not have multiple Metrics2 configuration files. It is recommended to consolidate +metrics in a single properties file in a central location to remove ambiguity. The contents of +hadoop-metrics2-accumulo.properties+ +can be added to a central +hadoop-metrics2.properties+ in +$HADOOP_CONF_DIR+. + +As a note for configuring the file sink, the provided path should be absolute. A relative path or file name will be created relative +to the directory in which the Accumulo process was started. External tools, such as logrotate, can be used to prevent these files +from growing without bound. + +Each server process should have log messages from the Metrics2 library about the sinks that were created. Be sure to check +the Accumulo processes log files when debugging missing metrics output. 
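A minimal sketch of the properties involved, assuming the +accumulo+ metrics prefix (as used in the example +hadoop-metrics2-accumulo.properties+ file) and the stock Hadoop +FileSink+, might look like the following; the sink instance name, filename and period are illustrative:

....
# poll metrics sources every 30 seconds
*.period=30
# write all Accumulo metrics to a local file (use an absolute path)
accumulo.sink.file-all.class=org.apache.hadoop.metrics2.sink.FileSink
accumulo.sink.file-all.filename=/var/log/accumulo/accumulo-metrics.out
....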
+ +For additional information on configuring Metrics2, visit the +https://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html[Javadoc page for Metrics2]. + +[[tracing]] +=== Tracing +It can be difficult to determine why some operations are taking longer +than expected. For example, you may be looking up items with very low +latency, but sometimes the lookups take much longer. Determining the +cause of the delay is difficult because the system is distributed, and +the typical lookup is fast. + +Accumulo has been instrumented to record the time that various +operations take when tracing is turned on. The fact that tracing is +enabled follows all the requests made on behalf of the user throughout +the distributed infrastructure of accumulo, and across all threads of +execution. + +These time spans will be inserted into the +trace+ table in +Accumulo. You can browse recent traces from the Accumulo monitor +page. You can also read the +trace+ table directly like any +other table. + +The design of Accumulo's distributed tracing follows that of +http://research.google.com/pubs/pub36356.html[Google's Dapper]. + +==== Tracers +To collect traces, Accumulo needs at least one server listed in + +$ACCUMULO_HOME/conf/tracers+. The server collects traces +from clients and writes them to the +trace+ table. The Accumulo +user that the tracer connects to Accumulo with can be configured with +the following properties +(see the <<configuration,Configuration>> section for setting Accumulo server properties) + + trace.user + trace.token.property.password + +Other tracer configuration properties include + + trace.port.client - port tracer listens on + trace.table - table tracer writes to + trace.zookeeper.path - zookeeper path where tracers register + +The zookeeper path is configured to /tracers by default. If +multiple Accumulo instances are sharing the same ZooKeeper +quorum, take care to configure Accumulo with unique values for +this property. + +==== Configuring Tracing +Traces are collected via SpanReceivers. The default SpanReceiver +configured is org.apache.accumulo.core.trace.ZooTraceClient, which +sends spans to an Accumulo Tracer process, as discussed in the +previous section. This default can be changed to a different span +receiver, or additional span receivers can be added in a +comma-separated list, by modifying the property + + trace.span.receivers + +Individual span receivers may require their own configuration +parameters, which are grouped under the trace.span.receiver.* +prefix. ZooTraceClient uses the following properties. The first +three properties are populated from other Accumulo properties, +while the remaining ones should be prefixed with +trace.span.receiver. when set in the Accumulo configuration. + + tracer.zookeeper.host - populated from instance.zookeepers + tracer.zookeeper.timeout - populated from instance.zookeeper.timeout + tracer.zookeeper.path - populated from trace.zookeeper.path + tracer.send.timer.millis - timer for flushing send queue (in ms, default 1000) + tracer.queue.size - max queue size (default 5000) + tracer.span.min.ms - minimum span length to store (in ms, default 1) + +Note that to configure an Accumulo client for tracing, including +the Accumulo shell, the client configuration must be given the same +trace.span.receivers, trace.span.receiver.*, and trace.zookeeper.path +properties as the servers have. 
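For example, a client configuration file (see the <<ClientConfiguration>> section) that mirrors a typical server-side tracing setup might contain lines such as the following; adjust the receiver class, ZooKeeper path and any +trace.span.receiver.*+ options to match whatever your servers actually use:

  trace.span.receivers=org.apache.accumulo.tracer.ZooTraceClient
  trace.zookeeper.path=/tracers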
+ +Hadoop can also be configured to send traces to Accumulo, as of +Hadoop 2.6.0, by setting properties in Hadoop's core-site.xml +file. Instead of using the trace.span.receiver.* prefix, Hadoop +uses hadoop.htrace.*. The Hadoop configuration does not have +access to Accumulo's properties, so the +hadoop.htrace.tracer.zookeeper.host property must be specified. +The zookeeper timeout defaults to 30000 (30 seconds), and the +zookeeper path defaults to /tracers. An example of configuring +Hadoop to send traces to ZooTraceClient is + + <property> + <name>hadoop.htrace.spanreceiver.classes</name> + <value>org.apache.accumulo.core.trace.ZooTraceClient</value> + </property> + <property> + <name>hadoop.htrace.tracer.zookeeper.host</name> + <value>zookeeperHost:2181</value> + </property> + <property> + <name>hadoop.htrace.tracer.zookeeper.path</name> + <value>/tracers</value> + </property> + <property> + <name>hadoop.htrace.tracer.span.min.ms</name> + <value>1</value> + </property> + +The accumulo-core, accumulo-tracer, accumulo-fate and libthrift +jars must also be placed on Hadoop's classpath. + +===== Adding additional SpanReceivers +https://github.com/openzipkin/zipkin[Zipkin] +has a SpanReceiver supported by HTrace and popularized by Twitter +that users looking for a more graphical trace display may opt to use. +The following steps configure Accumulo to use +org.apache.htrace.impl.ZipkinSpanReceiver+ +in addition to the Accumulo's default ZooTraceClient, and they serve as a template +for adding any SpanReceiver to Accumulo: + +1. Add the Jar containing the ZipkinSpanReceiver class file to the ++$ACCUMULO_HOME/lib/+. It is critical that the Jar is placed in ++lib/+ and NOT in +lib/ext/+ so that the new SpanReceiver class +is visible to the same class loader of htrace-core. + +2. Add the following to +$ACCUMULO_HOME/conf/accumulo-site.xml+: + + <property> + <name>trace.span.receivers</name> + <value>org.apache.accumulo.tracer.ZooTraceClient,org.apache.htrace.impl.ZipkinSpanReceiver</value> + </property> + +3. Restart your Accumulo tablet servers. + +In order to use ZipkinSpanReceiver from a client as well as the Accumulo server, + +1. Ensure your client can see the ZipkinSpanReceiver class at runtime. For Maven projects, +this is easily done by adding to your client's pom.xml (taking care to specify a good version) + + <dependency> + <groupId>org.apache.htrace</groupId> + <artifactId>htrace-zipkin</artifactId> + <version>3.1.0-incubating</version> + <scope>runtime</scope> + </dependency> + +2. Add the following to your ClientConfiguration +(see the <<ClientConfiguration>> section) + + trace.span.receivers=org.apache.accumulo.tracer.ZooTraceClient,org.apache.htrace.impl.ZipkinSpanReceiver + +3. Instrument your client as in the next section. + +Your SpanReceiver may require additional properties, and if so these should likewise +be placed in the ClientConfiguration (if applicable) and Accumulo's +accumulo-site.xml+. +Two such properties for ZipkinSpanReceiver, listed with their default values, are + + <property> + <name>trace.span.receiver.zipkin.collector-hostname</name> + <value>localhost</value> + </property> + <property> + <name>trace.span.receiver.zipkin.collector-port</name> + <value>9410</value> + </property> + +==== Instrumenting a Client +Tracing can be used to measure a client operation, such as a scan, as +the operation traverses the distributed system. To enable tracing for +your application call + +[source,java] +import org.apache.accumulo.core.trace.DistributedTrace; +... 
+DistributedTrace.enable(hostname, "myApplication"); +// do some tracing +... +DistributedTrace.disable(); + +Once tracing has been enabled, a client can wrap an operation in a trace. + +[source,java] +import org.apache.htrace.Sampler; +import org.apache.htrace.Trace; +import org.apache.htrace.TraceScope; +... +TraceScope scope = Trace.startSpan("Client Scan", Sampler.ALWAYS); +BatchScanner scanner = conn.createBatchScanner(...); +// Configure your scanner +for (Entry<Key,Value> entry : scanner) { +} +scope.close(); + +The user can create additional Spans within a Trace. + +The sampler (such as +Sampler.ALWAYS+) for the trace should only be specified with a top-level span, +and subsequent spans will be collected depending on whether that first span was sampled. +Don't forget to specify a Sampler at the top-level span +because the default Sampler only samples when part of a pre-existing trace, +which will never occur in a client that never specifies a Sampler. + +[source,java] +TraceScope scope = Trace.startSpan("Client Update", Sampler.ALWAYS); +... +TraceScope readScope = Trace.startSpan("Read"); +... +readScope.close(); +... +TraceScope writeScope = Trace.startSpan("Write"); +... +writeScope.close(); +scope.close(); + +Like Dapper, Accumulo tracing supports user-defined annotations to associate additional data with a Trace. +Checking whether the client is currently tracing is necessary when using a sampler other than +Sampler.ALWAYS+. + +[source,java] +... +int numberOfEntriesRead = 0; +TraceScope readScope = Trace.startSpan("Read"); +// Do the read, update the counter +... +if (Trace.isTracing()) + readScope.getSpan().addKVAnnotation("Number of Entries Read".getBytes(StandardCharsets.UTF_8), + String.valueOf(numberOfEntriesRead).getBytes(StandardCharsets.UTF_8)); + +It is also possible to add timeline annotations to your spans. +This associates a string with a given timestamp between the start and stop times for a span. + +[source,java] +... +writeScope.getSpan().addTimelineAnnotation("Initiating Flush"); + +Some client operations may have a high volume within your +application. As such, you may wish to only sample a percentage of +operations for tracing. As seen below, the CountSampler can be used to +help enable tracing for 1-in-1000 operations. + +[source,java] +import org.apache.htrace.impl.CountSampler; +... +Sampler sampler = new CountSampler(HTraceConfiguration.fromMap( + Collections.singletonMap(CountSampler.SAMPLER_FREQUENCY_CONF_KEY, "1000"))); +... +TraceScope readScope = Trace.startSpan("Read", sampler); +... +readScope.close(); + +Remember to close all spans and disable tracing when finished. + +[source,java] +DistributedTrace.disable(); + +==== Viewing Collected Traces +To view collected traces, use the "Recent Traces" link on the Monitor +UI. You can also programmatically access and print traces using the ++TraceDump+ class. + +===== Trace Table Format +This section is for developers looking to use data recorded in the trace table +directly, above and beyond the default services of the Accumulo monitor. +Please note the trace table format and its supporting classes +are not in the public API and may be subject to change in future versions. + +Each span received by a tracer's ZooTraceClient is recorded in the trace table +in the form of three entries: span entries, index entries, and start time entries. +Span and start time entries record full span information, +whereas index entries provide indexing into span information +useful for quickly finding spans by type or start time.
+ +Each entry is illustrated by a description and sample of data. +In the description, a token in quotes is a String literal, +whereas other tokens are span variables. +Parentheses group parts together to distinguish colon characters inside the +column family or qualifier from the colon that separates column family and qualifier. +We use the format +row columnFamily:columnQualifier columnVisibility value+ +(omitting the timestamp, which records the time an entry is written to the trace table). + +Span entries take the following form: + + traceId "span":(parentSpanId:spanId) [] spanBinaryEncoding + 63b318de80de96d1 span:4b8f66077df89de1:3778c6739afe4e1 [] %18;%09;... + +The parentSpanId is "" for the root span of a trace. +The spanBinaryEncoding is a compact Apache Thrift encoding of the original Span object. +This allows clients (and the Accumulo monitor) to recover all the details of the original Span +at a later time, by scanning the trace table and decoding the value of span entries +via +TraceFormatter.getRemoteSpan(entry)+. + +The trace table has a formatter class by default (org.apache.accumulo.tracer.TraceFormatter) +that changes how span entries appear in the Accumulo shell. +Normal scans of the trace table do not use this formatter representation; +it exists only to make span entries easier to view inside the Accumulo shell. + +Index entries take the following form: + + "idx":service:startTime description:sender [] traceId:elapsedTime + idx:tserver:14f3828f58b startScan:localhost [] 63b318de80de96d1:1 + +The service and sender are set by the first call of each Accumulo process +(and instrumented client processes) to +DistributedTrace.enable(...)+ +(the sender is autodetected if not specified). +The description is specified in each span. +Start time and the elapsed time (stop - start, 1 millisecond in the example above) +are recorded in milliseconds as long values serialized to a string in hex. + +Start time entries take the following form: + + "start":startTime "id":traceId [] spanBinaryEncoding + start:14f3828a351 id:63b318de80de96d1 [] %18;%09;... + +The following classes may be run from $ACCUMULO_HOME while Accumulo is running +to provide insight into trace statistics. These require +accumulo-trace-VERSION.jar to be provided on the Accumulo classpath +(+$ACCUMULO_HOME/lib/ext+ is fine). + + $ bin/accumulo org.apache.accumulo.tracer.TraceTableStats -u username -p password -i instancename + $ bin/accumulo org.apache.accumulo.tracer.TraceDump -u username -p password -i instancename -r + +==== Tracing from the Shell +You can enable tracing for operations run from the shell by using the ++trace on+ and +trace off+ commands. + +---- +root@test test> trace on + +root@test test> scan +a b:c [] d + +root@test test> trace off +Waiting for trace information +Waiting for trace information +Trace started at 2013/08/26 13:24:08.332 +Time Start Service@Location Name + 3628+0 shell@localhost shell:root + 8+1690 shell@localhost scan + 7+1691 shell@localhost scan:location + 6+1692 tserver@localhost startScan + 5+1692 tserver@localhost tablet read ahead 6 +---- + +=== Logging +Accumulo processes each write to a set of log files. By default these are found under ++$ACCUMULO_LOG_DIR+, which defaults to +$ACCUMULO_HOME/logs/+. + +[[watcher]] +=== Watcher +Accumulo includes scripts to automatically restart server processes in the case +of intermittent failures. To enable this watcher, edit +conf/accumulo-env.sh+ +to include the following: + +....
+# Should process be automatically restarted +export ACCUMULO_WATCHER="true" + +# What settings should we use for the watcher, if enabled +export UNEXPECTED_TIMESPAN="3600" +export UNEXPECTED_RETRIES="2" + +export OOM_TIMESPAN="3600" +export OOM_RETRIES="5" + +export ZKLOCK_TIMESPAN="600" +export ZKLOCK_RETRIES="5" +.... + +When an Accumulo process dies, the watcher will look at the logs and exit codes +to determine how the process failed and either restart or fail depending on the +recent history of failures. The restarting policy for various failure conditions +is configurable through the +*_TIMESPAN+ and +*_RETRIES+ variables shown above. + +=== Recovery + +In the event of TabletServer failure or an error while shutting Accumulo down, some +mutations may not have been minor compacted to HDFS properly. In this case, +Accumulo will automatically reapply such mutations from the write-ahead log +either when the tablets from the failed server are reassigned by the Master (in the +case of a single TabletServer failure) or the next time Accumulo starts (in the event of +failure during shutdown). + +Recovery is performed by asking a tablet server to sort the logs so that tablets can easily find their missing +updates. The sort status of each file is displayed on the +Accumulo monitor status page. Once the recovery is complete, any +tablets involved should return to an "online" state. Until then those tablets will be +unavailable to clients. + +The Accumulo client library is configured to retry failed mutations and in many +cases clients will be able to continue processing after the recovery process without +throwing an exception. + +=== Migrating Accumulo from non-HA Namenode to HA Namenode + +The following steps will allow a non-HA instance to be migrated to an HA instance. Consider an HDFS URL ++hdfs://namenode.example.com:8020+ which is going to be moved to +hdfs://nameservice1+. + +Before moving HDFS over to the HA namenode, use +$ACCUMULO_HOME/bin/accumulo admin volumes+ to confirm +that the only volume displayed is the volume from the current namenode's HDFS URL. + + Listing volumes referenced in zookeeper + Volume : hdfs://namenode.example.com:8020/accumulo + + Listing volumes referenced in accumulo.root tablets section + Volume : hdfs://namenode.example.com:8020/accumulo + Listing volumes referenced in accumulo.root deletes section (volume replacement occurrs at deletion time) + + Listing volumes referenced in accumulo.metadata tablets section + Volume : hdfs://namenode.example.com:8020/accumulo + + Listing volumes referenced in accumulo.metadata deletes section (volume replacement occurrs at deletion time) + +After verifying the current volume is correct, shut down the cluster and transition HDFS to the HA nameservice. + +Edit +$ACCUMULO_HOME/conf/accumulo-site.xml+ to notify Accumulo that a volume is being replaced. First, +add the new nameservice volume to the +instance.volumes+ property. Next, add the ++instance.volumes.replacements+ property in the form of +old new+. It's important not to include +the volume that's being replaced in +instance.volumes+; otherwise it's possible Accumulo could continue +to write to the volume.
+ +[source,xml] +<!-- instance.dfs.uri and instance.dfs.dir should not be set--> +<property> + <name>instance.volumes</name> + <value>hdfs://nameservice1/accumulo</value> +</property> +<property> + <name>instance.volumes.replacements</name> + <value>hdfs://namenode.example.com:8020/accumulo hdfs://nameservice1/accumulo</value> +</property> + +Run +$ACCUMULO_HOME/bin/accumulo init --add-volumes+ and start up the Accumulo cluster. Verify that the +new nameservice volume shows up with +$ACCUMULO_HOME/bin/accumulo admin volumes+. + + + Listing volumes referenced in zookeeper + Volume : hdfs://namenode.example.com:8020/accumulo + Volume : hdfs://nameservice1/accumulo + + Listing volumes referenced in accumulo.root tablets section + Volume : hdfs://namenode.example.com:8020/accumulo + Volume : hdfs://nameservice1/accumulo + Listing volumes referenced in accumulo.root deletes section (volume replacement occurrs at deletion time) + + Listing volumes referenced in accumulo.metadata tablets section + Volume : hdfs://namenode.example.com:8020/accumulo + Volume : hdfs://nameservice1/accumulo + Listing volumes referenced in accumulo.metadata deletes section (volume replacement occurrs at deletion time) + +Some erroneous GarbageCollector messages may still be seen for a small period while data is transitioning to +the new volumes. This is expected and can usually be ignored. + +=== Achieving Stability in a VM Environment + +For testing, demonstration, and even operational uses, Accumulo is often +installed and run in a virtual machine (VM) environment. The majority of +long-term operational uses of Accumulo are on bare-metal clusters. However, the +core design of Accumulo and its dependencies does not preclude running stably for +long periods within a VM. Many of Accumulo's operational robustness features to +handle failures like periodic network partitioning in a large cluster carry +over well to VM environments. This guide covers general recommendations for +maximizing stability in a VM environment, including some failure +modes that are more common when running in VMs. + +==== Known failure modes: Setup and Troubleshooting +In addition to the general failure modes of running Accumulo, VMs can introduce a +couple of environmental challenges that can affect process stability. Clock +drift is more common in VMs, especially when VMs are +suspended and resumed. Clock drift can cause Accumulo servers to assume that +they have lost connectivity to the other Accumulo processes and/or lose their +locks in Zookeeper. VM environments also frequently have constrained resources, +such as CPU, RAM, network, and disk throughput and capacity. Accumulo generally +deals well with constrained resources from a stability perspective (optimizing +performance will require additional tuning, which is not covered in this +section); however, there are some limits. + +===== Physical Memory +One of those limits has to do with the Linux out-of-memory (OOM) killer. A common +failure mode in VM environments (and in some bare metal installations) is when +the Linux out-of-memory killer decides to kill processes in order to avoid a +kernel panic when provisioning a memory page. This often happens in VMs due to +the large number of processes that must run in a small memory footprint.
In +addition to the Linux core processes, a single-node Accumulo setup requires a +Hadoop Namenode, a Hadoop Secondary Namenode, a Hadoop Datanode, a Zookeeper +server, an Accumulo Master, an Accumulo GC and an Accumulo TabletServer. +Typical setups also include an Accumulo Monitor, an Accumulo Tracer, a Hadoop +ResourceManager, a Hadoop NodeManager, provisioning software, and client +applications. Between all of these processes, it is not uncommon to +over-subscribe the available RAM in a VM. We recommend setting up VMs without +swap enabled, so that rather than performance grinding to a halt when physical +memory is exhausted, the kernel will select processes more or less at random to kill in order +to free up memory. + +Calculating the maximum possible memory usage is essential in creating a stable +Accumulo VM setup. Safely engineering memory allocation for stability is a +matter of then bringing the calculated maximum memory usage under the physical +memory by a healthy margin. The margin is to account for operating system-level +operations, such as managing processes, maintaining virtual memory pages, and +file system caching. When the Linux out-of-memory killer kills one of your processes, you +will probably only see evidence of that in /var/log/messages. Out-of-memory +process kills do not show up in Accumulo or Hadoop logs. + +To calculate the maximum memory usage of all Java virtual machine (JVM) processes, +add the maximum heap size (often limited by a -Xmx... argument, such as in +accumulo-env.sh) and the off-heap memory usage. Off-heap memory usage +includes the following: + +* "Permanent Space", where the JVM stores Classes, Methods, and other code elements. This can be limited by a JVM flag such as +-XX:MaxPermSize=100m+, and is typically tens of megabytes. +* Code generation space, where the JVM stores just-in-time compiled code. This is typically small enough to ignore. +* Socket buffers, where the JVM stores send and receive buffers for each socket. +* Thread stacks, where the JVM allocates memory to manage each thread. +* Direct memory space and JNI code, where applications can allocate memory outside of the JVM-managed space. For Accumulo, this includes the native in-memory maps that are allocated with the +tserver.memory.maps.max+ parameter in accumulo-site.xml. +* Garbage collection space, where the JVM stores information used for garbage collection. + +You can assume that each Hadoop and Accumulo process will use ~100-150MB for +off-heap memory, plus the in-memory map of the Accumulo TServer process. A +simple calculation for physical memory requirements follows: + +.... + Physical memory needed + = (per-process off-heap memory) + (heap memory) + (other processes) + (margin) + = (number of java processes * 150M + native map) + (sum of -Xmx settings for java processes) + (total applications memory, provisioning memory, etc.) + (1G) + = (11*150M +500M) + (1G +1G +1G +256M +1G +256M +512M +512M +512M +512M +512M) + (2G) + (1G) + = (2150M) + (7G) + (2G) + (1G) + = ~12GB +.... + +These calculations can add up quickly with the large number of processes, +especially in constrained VM environments. To reduce the physical memory +requirements, it is a good idea to reduce maximum heap limits and turn off +unnecessary processes. If you're not using YARN in your application, you can +turn off the ResourceManager and NodeManager. If you're not expecting to +re-provision the cluster frequently you can turn off or reduce provisioning +processes such as Salt Stack minions and masters.
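As a quick way to check whether the kernel's OOM killer has already been at work on a node, the kernel log can be searched directly. The log locations below are typical but vary by distribution (for example, Debian and Ubuntu use +/var/log/syslog+ rather than +/var/log/messages+):

....
# look for kernel OOM killer activity
dmesg | grep -i 'out of memory'
grep -i 'killed process' /var/log/messages
....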
+ +===== Disk Space +Disk space is primarily used for two operations: storing data and storing logs. +While Accumulo generally stores all of its key/value data in HDFS, Accumulo, +Hadoop, and Zookeeper all store a significant amount of logs in a directory on +a local file system. Care should be taken to make sure that (a) limitations to +the amount of logs generated are in place, and (b) enough space is available to +host the generated logs on the partitions to which they are assigned. When space is +not available to log, processes will hang. This can cause interruptions in +the availability of Accumulo, as well as cascade into failures of various +processes. + +Hadoop, Accumulo, and Zookeeper use log4j as a logging mechanism, and each of +them has a way of limiting the logs and directing them to a particular +directory. Logs are generated independently for each process, so when +considering the total space, you need to add up the maximum log volume generated by +each process. Typically, a rolling log setup in which each process can generate +something like ten 100MB files is instituted, resulting in a maximum file system +usage of 1GB per process. Default setups for Hadoop and Zookeeper are often +unbounded, so it is important to set these limits in the logging configuration +files for each subsystem. Consult the user manual for each system for +instructions on how to limit generated logs. + +===== Zookeeper Interaction +Accumulo is designed to scale up to thousands of nodes. At that scale, +intermittent interruptions in network service and other rare failures of +compute nodes become more common. To limit the impact of node failures on +overall service availability, Accumulo uses a heartbeat monitoring system that +leverages Zookeeper's ephemeral locks. There are several conditions that can +occur that cause Accumulo processes to lose their Zookeeper locks, some of which +are true interruptions to availability and some of which are false positives. +Several of these conditions become more common in VM environments, where they +can be exacerbated by resource constraints and clock drift. + +Accumulo includes a mechanism, known as the <<watcher>>, to limit the impact of these +false positives. The watcher monitors Accumulo processes and will restart +them when they fail for certain reasons. The watcher can be configured within +the accumulo-env.sh file inside of Accumulo's configuration directory. We +recommend using the watcher to monitor Accumulo processes, as it will restore +the system to full capacity without administrator interaction after many of the +common failure modes. + +==== Tested Versions +Another large consideration for Accumulo stability is to use versions of +software that have been tested together in a VM environment. Any cluster of +processes that have not been tested together is likely to expose running +conditions that vary from the environments in which the individual +components were tested. For example, Accumulo's use of HDFS includes many short block +reads, which differs from the more common full file read used in most +map/reduce applications. We have found that certain combinations of Accumulo and +Hadoop versions include bugs that greatly affect overall stability. In +our testing, Accumulo 1.6.2, Hadoop 2.6.0, and Zookeeper 3.4.6 resulted in +stable VM clusters that did not fail during a month of testing, while Accumulo 1.6.1, +Hadoop 2.5.1, and Zookeeper 3.4.5 had a mean time between failures of less than +a week under heavy ingest and query load.
We expect that results will vary with +other configurations, and you should choose your software versions with that in +mind. + + + + + + +