ACCUMULO-612 added information on monitor page to user manual
Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/6c765032
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/6c765032
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/6c765032

Branch: refs/heads/ACCUMULO-378
Commit: 6c76503279096780e6b9b6ef974e7eec181cae29
Parents: f783b4b
Author: Billie Rinaldi <bil...@apache.org>
Authored: Thu Jun 12 17:25:52 2014 -0700
Committer: Billie Rinaldi <bil...@apache.org>
Committed: Thu Jun 12 17:25:52 2014 -0700

----------------------------------------------------------------------
 .../main/asciidoc/chapters/administration.txt | 49 ++++++++++++++++++--
 docs/src/main/asciidoc/chapters/design.txt | 2 +-
 2 files changed, 46 insertions(+), 5 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6c765032/docs/src/main/asciidoc/chapters/administration.txt
----------------------------------------------------------------------
diff --git a/docs/src/main/asciidoc/chapters/administration.txt b/docs/src/main/asciidoc/chapters/administration.txt
index b4c1a71..28030cb 100644
--- a/docs/src/main/asciidoc/chapters/administration.txt
+++ b/docs/src/main/asciidoc/chapters/administration.txt
@@ -214,8 +214,8 @@
 Make sure ZooKeeper is configured and running on at least one machine in the cluster. Start Accumulo using the +bin/start-all.sh+ script.
-To verify that Accumulo is running, check the Status page as described under
-_Monitoring_. In addition, the Shell can provide some information about the status of
+To verify that Accumulo is running, check the Status page as described in
+<<monitoring>>. In addition, the Shell can provide some information about the status of
 tables via reading the metadata tables.
 ==== Stopping Accumulo
@@ -254,12 +254,53 @@
 account for the removal of these hosts.
 Bear in mind that the monitor will not re-read the slaves file automatically, so it will report the decomissioned servers as down; it's recommended that you restart the monitor so that the node list is up to date.
+[[monitoring]]
 === Monitoring
 The Accumulo Master provides an interface for monitoring the status and health of
-Accumulo components. This interface can be accessed by pointing a web browser to
-+http://accumulomaster:50095/status+
+Accumulo components. The Accumulo Monitor provides a web UI for accessing this information at
++http://_monitorhost_:50095/+.
+Things highlighted in yellow may be in need of attention.
+If anything is highlighted in red on the monitor page, it is something that definitely needs attention.
+
+The Overview page contains some summary information about the Accumulo instance, including the version, instance name, and instance ID.
+There is a table labeled Accumulo Master with current status, a table listing the active Zookeeper servers, and graphs displaying various metrics over time.
+These include ingest and scan performance and other useful measurements.
+
+The Master Server, Tablet Servers, and Tables pages display metrics grouped in different ways (e.g. by tablet server or by table).
+Metrics typically include number of entries (key/value pairs), ingest and query rates.
+The number of running scans, major and minor compactions are in the form _number_running_ (_number_queued_).
+Another important metric is hold time, which is the amount of time a tablet has been waiting but unable to flush its memory in a minor compaction.
+
+The Server Activity page graphically displays tablet server status, with each server represented as a circle or square.
+Different metrics may be assigned to the nodes' color and speed of oscillation.
+The Overall Avg metric is only used on the Server Activity page, and represents the average of all the other metrics (after normalization).
+Similarly, the Overall Max metric picks the metric with the maximum normalized value.
+
+The Garbage Collector page displays a list of garbage collection cycles, the number of files found of each type (including deletion candidates in use and files actually deleted), and the length of the deletion cycle.
+The Traces page displays data for recent traces performed (see the following section for information on <<tracing>>).
+The Recent Logs page displays warning and error logs forwarded to the monitor from all Accumulo processes.
+Also, the XML and JSON links provide metrics in XML and JSON formats, respectively.
+
+==== SSL
+SSL may be enabled for the monitor page by setting the following properties in the +accumulo-site.xml+ file:
+
+ monitor.ssl.keyStore
+ monitor.ssl.keyStorePassword
+ monitor.ssl.trustStore
+ monitor.ssl.trustStorePassword
+
+If the Accumulo conf directory has been configured (in particular the +accumulo-env.sh+ file must be set up), the +generate_monitor_certificate.sh+ script in the Accumulo +bin+ directory can be used to create the keystore and truststore files with random passwords.
+The script will print out the properties that need to be added to the +accumulo-site.xml+ file.
+The stores can also be generated manually with the Java +keytool+ command, whose usage can be seen in the +generate_monitor_certificate.sh+ script.
+
+If SSL is enabled, the monitor URL can only be accessed via https.
+This also allows you to access the Accumulo shell through the monitor page.
+The left navigation bar will have a new link to Shell.
+An Accumulo user name and password must be entered for access to the shell.
+
+[[tracing]]
 === Tracing
 It can be difficult to determine why some operations are taking longer than expected.
 For example, you may be looking up items with very low

http://git-wip-us.apache.org/repos/asf/accumulo/blob/6c765032/docs/src/main/asciidoc/chapters/design.txt
----------------------------------------------------------------------
diff --git a/docs/src/main/asciidoc/chapters/design.txt b/docs/src/main/asciidoc/chapters/design.txt
index f57afc3..f975029 100644
--- a/docs/src/main/asciidoc/chapters/design.txt
+++ b/docs/src/main/asciidoc/chapters/design.txt
@@ -93,7 +93,7 @@
 the state of an instance. The Monitor shows graphs and tables which contain info about read/write rates, cache hit/miss rates, and Accumulo table information such as scan rate and active/queued compactions. Additionally, the Monitor should always be the first point of entry when attempting to debug an Accumulo problem as it will show high-level problems
-in addition to aggregated errors from all nodes in the cluster. See the section on Monitoring
+in addition to aggregated errors from all nodes in the cluster. See the section on <<monitoring>>
 for more information. Multiple Monitors can be run to provide hot-standby support in the face of failure. Due to the