http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex b/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
deleted file mode 100644
index ff1cebd..0000000
--- a/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
+++ /dev/null
@@ -1,343 +0,0 @@
-
-% Licensed to the Apache Software Foundation (ASF) under one or more
-% contributor license agreements. See the NOTICE file distributed with
-% this work for additional information regarding copyright ownership.
-% The ASF licenses this file to You under the Apache License, Version 2.0
-% (the "License"); you may not use this file except in compliance with
-% the License. You may obtain a copy of the License at
-%
-%     http://www.apache.org/licenses/LICENSE-2.0
-%
-% Unless required by applicable law or agreed to in writing, software
-% distributed under the License is distributed on an "AS IS" BASIS,
-% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-% See the License for the specific language governing permissions and
-% limitations under the License.
-
-\chapter{Table Design}
-
-\section{Basic Table}
-
-Since Accumulo tables are sorted by row ID, each table can be thought of as being
-indexed by the row ID. Lookups performed by row ID can be executed quickly by doing
-a binary search, first across the tablets and then within a tablet. Clients should
-choose a row ID carefully in order to support their desired application. A simple rule
-is to select a unique identifier as the row ID for each entity to be stored and to
-store all the other attributes to be tracked as columns under this row ID. For example,
-if we have the following data in a comma-separated file:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    userid,age,address,account-balance
-\end{verbatim}\endgroup
-
-We might choose to store this data using the userid as the rowID, the column
-name in the column family, and a blank column qualifier:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-Mutation m = new Mutation(userid);
-final String column_qualifier = "";
-m.put("age", column_qualifier, age);
-m.put("address", column_qualifier, address);
-m.put("balance", column_qualifier, account_balance);
-
-writer.addMutation(m);
-\end{verbatim}\endgroup
-
-We could then retrieve any of the columns for a specific userid by specifying the
-userid as the range of a scanner and fetching specific columns:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-Range r = new Range(userid, userid); // single row
-Scanner s = conn.createScanner("userdata", auths);
-s.setRange(r);
-s.fetchColumnFamily(new Text("age"));
-
-for(Entry<Key,Value> entry : s)
-    System.out.println(entry.getValue().toString());
-\end{verbatim}\endgroup
-
-\section{RowID Design}
-
-Often it is necessary to transform the rowID in order to have rows ordered in a way
-that is optimal for anticipated access patterns. A good example of this is reversing
-the order of components of internet domain names in order to group rows of the
-same parent domain together:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-com.google.code
-com.google.labs
-com.google.mail
-com.yahoo.mail
-com.yahoo.research
-\end{verbatim}\endgroup
-
-Some data may result in the creation of very large rows - rows with many columns.
-In this case the table designer may wish to split up these rows for better load
-balancing while keeping them sorted together for scanning purposes. This can be
-done by appending a random substring at the end of the row:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-com.google.code_00
-com.google.code_01
-com.google.code_02
-com.google.labs_00
-com.google.mail_00
-com.google.mail_01
-\end{verbatim}\endgroup
-
-It could also be done by appending a string representation of some period of time,
-such as the date rounded to the week or month:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-com.google.code_201003
-com.google.code_201004
-com.google.code_201005
-com.google.labs_201003
-com.google.mail_201003
-com.google.mail_201004
-\end{verbatim}\endgroup
-
-Appending dates provides the additional capability of restricting a scan to a given
-date range.
-
-\section{Lexicoders}
-Since Keys in Accumulo are sorted lexicographically by default, it's often useful to encode
-common data types into a byte format in which their sort order corresponds to the sort order
-in their native form. An example of this is encoding dates and numerical data so that ranges
-of them can be searched efficiently.
-
-The lexicoders are a standard and extensible way of encoding Java types. Here's an example
-of a lexicoder that encodes a Java Date object so that it sorts lexicographically:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// create new date lexicoder
-DateLexicoder dateEncoder = new DateLexicoder();
-
-// truncate the time to hours
-long epoch = System.currentTimeMillis();
-Date hour = new Date(epoch - (epoch % 3600000));
-
-// encode the rowId so that it is sorted lexicographically
-Mutation mutation = new Mutation(dateEncoder.encode(hour));
-mutation.put(new Text("colf"), new Text("colq"), new Value(new byte[]{}));
-\end{verbatim}\endgroup
-
-If we want to return the most recent date first, we can reverse the sort order
-with the reverse lexicoder:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// create new date lexicoder and reverse lexicoder
-DateLexicoder dateEncoder = new DateLexicoder();
-ReverseLexicoder<Date> reverseEncoder = new ReverseLexicoder<Date>(dateEncoder);
-
-// truncate the time to hours
-long epoch = System.currentTimeMillis();
-Date hour = new Date(epoch - (epoch % 3600000));
-
-// encode the rowId so that it sorts in reverse lexicographic order
-Mutation mutation = new Mutation(reverseEncoder.encode(hour));
-mutation.put(new Text("colf"), new Text("colq"), new Value(new byte[]{}));
-\end{verbatim}\endgroup
-
-
-\section{Indexing}
-In order to support lookups via more than one attribute of an entity, additional
-indexes can be built. However, because Accumulo tables can support any number of
-columns without specifying them beforehand, a single additional index will often
-suffice for supporting lookups of records in the main table. Here, the index has, as
-the rowID, the Value or Term from the main table, the column families are the same,
-and the column qualifier of the index table contains the rowID from the main table.
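-
-A minimal sketch of writing one such index entry follows (the names term,
-fieldName, mainRowID, and indexWriter are placeholders rather than a
-prescribed API); the table below summarizes the schema:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// index entry: Term as rowID, field name as family, main rowID as qualifier
-Mutation indexMutation = new Mutation(new Text(term));
-indexMutation.put(new Text(fieldName), new Text(mainRowID), new Value(new byte[]{}));
-indexWriter.addMutation(indexMutation);
-\end{verbatim}\endgroup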
-
-\begin{center}
-$\begin{array}{|c|c|c|c|c|c|} \hline
-\multicolumn{5}{|c|}{\mbox{Key}} & \multirow{3}{*}{\mbox{Value}}\\ \cline{1-5}
-\multirow{2}{*}{\mbox{Row ID}}& \multicolumn{3}{|c|}{\mbox{Column}} & \multirow{2}{*}{\mbox{Timestamp}} & \\ \cline{2-4}
-& \mbox{Family} & \mbox{Qualifier} & \mbox{Visibility} & & \\ \hline \hline
-\mbox{Term} & \mbox{Field Name} & \mbox{MainRowID} & & &\\ \hline
-\end{array}$
-\end{center}
-
-Note: We store rowIDs in the column qualifier rather than the Value so that we can
-have more than one rowID associated with a particular term within the index. If we
-stored this in the Value we would only see one of the rows in which the value
-appears, since Accumulo is configured by default to return only the most recent
-value associated with a key.
-
-Lookups can then be done by scanning the Index Table first for occurrences of the
-desired values in the columns specified, which returns a list of row IDs from the main
-table. These can then be used to retrieve each matching record, in its entirety or as a
-subset of its columns, from the Main Table.
-
-To support efficient lookups of multiple rowIDs from the same table, the Accumulo
-client library provides a BatchScanner. Users specify a set of Ranges to the
-BatchScanner, which performs the lookups in multiple threads to multiple servers
-and returns an Iterator over all the rows retrieved. The rows returned are NOT in
-sorted order, as is the case with the basic Scanner interface.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// first we scan the index for IDs of rows matching our query
-
-Text term = new Text("mySearchTerm");
-
-HashSet<Range> matchingRows = new HashSet<Range>();
-
-Scanner indexScanner = conn.createScanner("index", auths);
-indexScanner.setRange(new Range(term, term));
-
-// we retrieve the matching rowIDs and create a set of ranges
-for(Entry<Key,Value> entry : indexScanner)
-    matchingRows.add(new Range(entry.getKey().getColumnQualifier()));
-
-// now we pass the set of rowIDs to the batch scanner to retrieve them
-BatchScanner bscan = conn.createBatchScanner("table", auths, 10);
-
-bscan.setRanges(matchingRows);
-bscan.fetchColumnFamily(new Text("attributes"));
-
-for(Entry<Key,Value> entry : bscan)
-    System.out.println(entry.getValue());
-\end{verbatim}\endgroup
-
-One advantage of the dynamic schema capabilities of Accumulo is that different
-fields may be indexed into the same physical table. However, it may be necessary to
-create different index tables if the terms must be formatted differently in order to
-maintain proper sort order. For example, real numbers must be formatted
-differently than their usual notation in order to be sorted correctly. In these cases,
-usually one index per unique data type will suffice.
-
-\section{Entity-Attribute and Graph Tables}
-
-Accumulo is ideal for storing entities and their attributes, especially if the
-attributes are sparse. It is often useful to join several datasets together on common
-entities within the same table. This can allow for the representation of graphs,
-including nodes, their attributes, and connections to other nodes.
-
-Rather than storing individual events, Entity-Attribute or Graph tables store
-aggregate information about the entities involved in the events and the
-relationships between entities. This is often preferable when single events aren't
-very useful and when a continuously updated summarization is desired.
-
-The physical schema for an entity-attribute or graph table is as follows:
-
-\begin{center}
-$\begin{array}{|c|c|c|c|c|c|} \hline
-\multicolumn{5}{|c|}{\mbox{Key}} & \multirow{3}{*}{\mbox{Value}}\\ \cline{1-5}
-\multirow{2}{*}{\mbox{Row ID}}& \multicolumn{3}{|c|}{\mbox{Column}} & \multirow{2}{*}{\mbox{Timestamp}} & \\ \cline{2-4}
-& \mbox{Family} & \mbox{Qualifier} & \mbox{Visibility} & & \\ \hline \hline
-\mbox{EntityID} & \mbox{Attribute Name} & \mbox{Attribute Value} & & & \mbox{Weight} \\ \hline
-\mbox{EntityID} & \mbox{Edge Type} & \mbox{Related EntityID} & & & \mbox{Weight} \\ \hline
-\end{array}$
-\end{center}
-
-For example, to keep track of employees, managers and products the following
-entity-attribute table could be used. Note that the weights are not always necessary
-and are set to 0 when not used.
-
-$\begin{array}{llll}
-\bf{RowID} & \bf{Column Family} & \bf{Column Qualifier} & \bf{Value} \\
-\\
-E001 & name & bob & 0 \\
-E001 & department & sales & 0 \\
-E001 & hire\_date & 20030102 & 0 \\
-E001 & units\_sold & P001 & 780 \\
-\\
-E002 & name & george & 0 \\
-E002 & department & sales & 0 \\
-E002 & manager\_of & E001 & 0 \\
-E002 & manager\_of & E003 & 0 \\
-\\
-E003 & name & harry & 0 \\
-E003 & department & accounts\_recv & 0 \\
-E003 & hire\_date & 20000405 & 0 \\
-E003 & units\_sold & P002 & 566 \\
-E003 & units\_sold & P001 & 232 \\
-\\
-P001 & product\_name & nike\_airs & 0 \\
-P001 & product\_type & shoe & 0 \\
-P001 & in\_stock & germany & 900 \\
-P001 & in\_stock & brazil & 200 \\
-\\
-P002 & product\_name & basic\_jacket & 0 \\
-P002 & product\_type & clothing & 0 \\
-P002 & in\_stock & usa & 3454 \\
-P002 & in\_stock & germany & 700 \\
-\end{array}$
-\vspace{5mm}
-
-To allow efficient updating of edge weights, an aggregating iterator can be
-configured to add the value of all mutations applied with the same key. These types
-of tables can easily be created from raw events by simply extracting the entities,
-attributes, and relationships from individual events and inserting the keys into
-Accumulo each with a count of 1. The aggregating iterator will take care of
-maintaining the edge weights.
-
-\section{Document-Partitioned Indexing}
-
-Using a simple index as described above works well when looking for records that
-match one of a set of given criteria. When looking for records that match more than
-one criterion simultaneously, such as when looking for documents that contain all of
-the words `the' and `white' and `house', there are several issues.
-
-First is that the set of all records matching any one of the search terms must be sent
-to the client, which incurs a lot of network traffic. The second problem is that the
-client is responsible for performing set intersection on the sets of records returned
-to eliminate all but the records matching all search terms. The memory of the client
-may easily be overwhelmed during this operation.
-
-For these reasons Accumulo includes support for a scheme known as sharded
-indexing, in which these set operations can be performed at the TabletServers and
-decisions about which records to include in the result set can be made without
-incurring network traffic.
-
-This is accomplished via partitioning records into bins that each reside on at most
-one TabletServer, and then creating an index of terms per record within each bin as
-follows:
-
-\begin{center}
-$\begin{array}{|c|c|c|c|c|c|} \hline
-\multicolumn{5}{|c|}{\mbox{Key}} & \multirow{3}{*}{\mbox{Value}}\\ \cline{1-5}
-\multirow{2}{*}{\mbox{Row ID}}& \multicolumn{3}{|c|}{\mbox{Column}} & \multirow{2}{*}{\mbox{Timestamp}} & \\ \cline{2-4}
-& \mbox{Family} & \mbox{Qualifier} & \mbox{Visibility} & & \\ \hline \hline
-\mbox{BinID} & \mbox{Term} & \mbox{DocID} & & & \mbox{Weight} \\ \hline
-\end{array}$
-\end{center}
-
-Documents or records are mapped into bins by a user-defined ingest application. By
-storing the BinID as the RowID we ensure that all the information for a particular
-bin is contained in a single tablet and hosted on a single TabletServer since
-Accumulo never splits rows across tablets. Storing the Terms as column families
-serves to enable fast lookups of all the documents within this bin that contain the
-given term.
-
-Finally, we perform set intersection operations on the TabletServer via a special
-iterator called the Intersecting Iterator. Since documents are partitioned into many
-bins, a search of all documents must search every bin. We can use the BatchScanner
-to scan all bins in parallel. The Intersecting Iterator should be enabled on a
-BatchScanner within user query code as follows:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-Text[] terms = {new Text("the"), new Text("white"), new Text("house")};
-
-BatchScanner bs = conn.createBatchScanner(table, auths, 20);
-IteratorSetting iter = new IteratorSetting(20, "ii", IntersectingIterator.class);
-IntersectingIterator.setColumnFamilies(iter, terms);
-bs.addScanIterator(iter);
-bs.setRanges(Collections.singleton(new Range()));
-
-for(Entry<Key,Value> entry : bs) {
-    System.out.println(" " + entry.getKey().getColumnQualifier());
-}
-\end{verbatim}\endgroup
-
-This code effectively has the BatchScanner scan all tablets of a table, looking for
-documents that match all the given terms. Because all tablets are being scanned for
-every query, each query is more expensive than other Accumulo scans, which
-typically involve a small number of TabletServers. This reduces the number of
-concurrent queries supported and is subject to what is known as the `straggler'
-problem in which every query runs as slow as the slowest server participating.
-
-Of course, fast servers will return their results to the client, which can display them
-to the user immediately while waiting for the rest of the results to arrive. If the
-results are unordered this is quite effective as the first results to arrive are as good
-as any others to the user.
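-
-The ingest side of this scheme is not shown above. A minimal sketch, assuming a
-user-defined assignBin() mapping and the placeholder names docId and terms, might
-look like:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// map the document to a bin; assignBin() is user-defined, not an Accumulo API
-Text binId = assignBin(docId);
-
-// one mutation per document: BinID as rowID, one column per term
-Mutation m = new Mutation(binId);
-for(String term : terms)
-    m.put(new Text(term), new Text(docId), new Value(new byte[]{}));
-
-writer.addMutation(m);
-\end{verbatim}\endgroup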
http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex b/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
deleted file mode 100644
index 203fe0c..0000000
--- a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
+++ /dev/null
@@ -1,794 +0,0 @@
-
-% Licensed to the Apache Software Foundation (ASF) under one or more
-% contributor license agreements. See the NOTICE file distributed with
-% this work for additional information regarding copyright ownership.
-% The ASF licenses this file to You under the Apache License, Version 2.0
-% (the "License"); you may not use this file except in compliance with
-% the License. You may obtain a copy of the License at
-%
-%     http://www.apache.org/licenses/LICENSE-2.0
-%
-% Unless required by applicable law or agreed to in writing, software
-% distributed under the License is distributed on an "AS IS" BASIS,
-% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-% See the License for the specific language governing permissions and
-% limitations under the License.
-
-\chapter{Troubleshooting}
-
-\section{Logs}
-
-Q. The tablet server does not seem to be running!? What happened?
-
-Accumulo is a distributed system. It is supposed to run on remote
-equipment, across hundreds of computers. Each program that runs on
-these remote computers writes down events as they occur, into a local
-file. By default, this is defined in
-\texttt{\$ACCUMULO\_HOME}/conf/accumulo-env.sh as ACCUMULO\_LOG\_DIR.
-
-A. Look in the \texttt{\$ACCUMULO\_LOG\_DIR}/tserver*.log file. Specifically, check the end of the file.
-
-Q. The tablet server did not start and the debug log does not exist! What happened?
-
-When the individual programs are started, the stdout and stderr output
-of these programs is written to ``.out'' and ``.err'' files in
-\texttt{\$ACCUMULO\_LOG\_DIR}. Often, when configuration options, files
-or permissions are missing, messages will be left in these files.
-
-A. Probably a start-up problem. Look in \texttt{\$ACCUMULO\_LOG\_DIR}/tserver*.err
-
-\section{Monitor}
-
-Q. Accumulo is not working, what's wrong?
-
-There's a small web server that collects information about all the
-components that make up a running Accumulo instance. It will highlight
-unusual or unexpected conditions.
-
-A. Point your browser to the monitor (typically the master host, on port 50095). Is anything red or yellow?
-
-Q. My browser is reporting connection refused, and I cannot get to the monitor.
-
-The monitor program's output is also written to .err and .out files in
-the \texttt{\$ACCUMULO\_LOG\_DIR}. Look for problems in this file if the
-\texttt{\$ACCUMULO\_LOG\_DIR/monitor*.log} file does not exist.
-
-A. The monitor program is probably not running. Check the log files for errors.
-
-Q. My browser hangs trying to talk to the monitor.
-
-Your browser needs to be able to reach the monitor program. Often
-large clusters are firewalled, or use a VPN for internal
-communications. You can use SSH to proxy your browser to the cluster,
-or consult with your system administrator to gain access to the server
-from your browser.
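-
-For example, standard OpenSSH port forwarding (the host names here are
-placeholders) can make a remote monitor reachable as
-http://localhost:50095 on your workstation:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ ssh -N -L 50095:monitor-host:50095 user@cluster-gateway
-\end{verbatim}\endgroup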
-
-It is sometimes helpful to use a text-only browser to sanity-check the
-monitor while on the machine running the monitor:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ links http://localhost:50095
-\end{verbatim}\endgroup
-
-A. Verify that you are not firewalled from the monitor if it is running on a remote host.
-
-Q. The monitor responds, but there are no numbers for tservers and tables. The summary page says the master is down.
-
-The monitor program gathers all the details about the master and the
-tablet servers through the master. It will be mostly blank if the
-master is down.
-
-A. Check for a running master.
-
-\section{HDFS}
-
-Accumulo reads and writes to the Hadoop Distributed File System.
-Accumulo needs this file system available at all times for normal operations.
-
-Q. Accumulo is having problems ``getting a block blk\_1234567890123.'' How do I fix it?
-
-This troubleshooting guide does not cover HDFS, but in general, you
-want to make sure that all the datanodes are running and an fsck check
-finds the file system clean:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ hadoop fsck /accumulo
-\end{verbatim}\endgroup
-
-You can use:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ hadoop fsck /accumulo/path/to/corrupt/file -locations -blocks -files
-\end{verbatim}\endgroup
-
-to locate the block references of individual corrupt files, use those
-references to search the name node and individual data node logs to
-determine to which servers those blocks have been assigned, and then try
-to fix any underlying file system issues on those nodes.
-
-On a larger cluster, you may need to increase the number of Xceivers:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-  <property>
-    <name>dfs.datanode.max.xcievers</name>
-    <value>4096</value>
-  </property>
-\end{verbatim}\endgroup
-
-A. Verify HDFS is healthy, check the datanode logs.
-
-\section{Zookeeper}
-
-Q. \texttt{accumulo init} is hanging. It says something about talking to zookeeper.
-
-Zookeeper is also a distributed service. You will need to ensure that
-it is up. You can run the zookeeper command line tool to connect to
-any one of the zookeeper servers:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ zkCli.sh -server zoohost
-...
-[zk: zoohost:2181(CONNECTED) 0]
-\end{verbatim}\endgroup
-
-It is important to see the word \texttt{CONNECTED}! If you only see
-\texttt{CONNECTING} you will need to diagnose zookeeper errors.
-
-A. Check to make sure that zookeeper is up, and that
-\texttt{\$ACCUMULO\_HOME/conf/accumulo-site.xml} has been pointed to
-your zookeeper server(s).
-
-Q. Zookeeper is running, but it does not say \texttt{CONNECTED}
-
-Zookeeper processes talk to each other to elect a leader. All updates
-go through the leader and propagate to a majority of all the other
-nodes. If a majority of the nodes cannot be reached, zookeeper will
-not allow updates. Zookeeper also limits the number of connections to a
-server from any other single host. By default, this limit can be as small as 10
-and can be reached in some everything-on-one-machine test configurations.
-
-You can check the election status and connection status of clients by
-asking the zookeeper nodes for their status.
-You connect to zookeeper
-and ask it with the four-letter ``stat'' command:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ nc zoohost 2181
-stat
-Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
-Clients:
- /127.0.0.1:58289[0](queued=0,recved=1,sent=0)
- /127.0.0.1:60231[1](queued=0,recved=53910,sent=53915)
-
-Latency min/avg/max: 0/5/3008
-Received: 1561459
-Sent: 1561592
-Connections: 2
-Outstanding: 0
-Zxid: 0x621a3b
-Mode: standalone
-Node count: 22524
-$
-\end{verbatim}\endgroup
-
-
-A. Check zookeeper status, verify that it has a quorum, and has not exceeded maxClientCnxns.
-
-Q. My tablet server crashed! The logs say that it lost its zookeeper lock.
-
-Tablet servers reserve a lock in zookeeper to maintain their ownership
-over the tablets that have been assigned to them. Part of their
-responsibility for keeping the lock is to send zookeeper a keep-alive
-message periodically. If the tablet server fails to send a message in
-a timely fashion, zookeeper will remove the lock and notify the tablet
-server. If the tablet server does not receive a message from
-zookeeper, it will assume its lock has been lost, too. If a tablet
-server loses its lock, it kills itself: everything assumes it is dead
-already.
-
-A. Investigate why the tablet server did not send a timely message to
-zookeeper.
-
-\subsection{Keeping the tablet server lock}
-
-Q. My tablet server lost its lock. Why?
-
-The primary reason a tablet server loses its lock is that it has been pushed into swap.
-
-A large java program (like the tablet server) may have a large portion
-of its memory image unused. The operating system will favor pushing
-this allocated, but unused, memory into swap so that the memory can be
-re-used as a disk buffer. When the java virtual machine decides to
-access this memory, the OS will begin flushing disk buffers to return that
-memory to the VM. This can cause the entire process to block long
-enough for the zookeeper lock to be lost.
-
-A. Configure your system to reduce the kernel parameter ``swappiness'' from the default (60) to zero.
-
-Q. My tablet server lost its lock, and I have already set swappiness to
-zero. Why?
-
-Be careful not to over-subscribe memory. This can be easy to do if
-your accumulo processes run on the same nodes as hadoop's map-reduce
-framework. Remember to add up:
-
-\begin{itemize}
-\item{size of the JVM for the tablet server}
-\item{size of the in-memory map, if using the native map implementation}
-\item{size of the JVM for the data node}
-\item{size of the JVM for the task tracker}
-\item{size of the JVM times the maximum number of mappers and reducers}
-\item{size of the kernel and any support processes}
-\end{itemize}
-
-If a 16G node can run 2 mappers and 2 reducers, and each can be 2G,
-then there is only 8G for the data node, tserver, task tracker and OS.
-
-A. Reduce the memory footprint of each component until it fits comfortably.
-
-Q. My tablet server lost its lock, swappiness is zero, and my node has lots of unused memory!
-
-The JVM memory garbage collector may fall behind and cause a
-``stop-the-world'' garbage collection. On a large memory virtual
-machine, this collection can take a long time. This happens more
-frequently when the JVM is getting low on free memory. Check the logs
-of the tablet server.
-You will see lines like this:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-2013-06-20 13:43:20,607 [tabletserver.TabletServer] DEBUG: gc ParNew=0.00(+0.00) secs
-    ConcurrentMarkSweep=0.00(+0.00) secs freemem=1,868,325,952(+1,868,325,952) totalmem=2,040,135,680
-\end{verbatim}\endgroup
-
-When ``freemem'' becomes small relative to the amount of memory
-needed, the JVM will spend more time finding free memory than
-performing work. This can cause long delays in sending keep-alive
-messages to zookeeper.
-
-A. Ensure the tablet server JVM is not running low on memory.
-
-\section{Tools}
-
-The accumulo script can be used to run classes from the command line.
-This section shows how a few of the utilities work, but there are many
-more.
-
-There's a class that will examine an accumulo storage file and print
-out basic metadata.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo /accumulo/tables/1/default_tablet/A000000n.rf
-2013-07-16 08:17:14,778 [util.NativeCodeLoader] INFO : Loaded the native-hadoop library
-Locality group         : <DEFAULT>
-        Start block          : 0
-        Num   blocks         : 1
-        Index level 0        : 62 bytes  1 blocks
-        First key            : 288be9ab4052fe9e span:34078a86a723e5d3:3da450f02108ced5 [] 1373373521623 false
-        Last key             : start:13fc375709e id:615f5ee2dd822d7a [] 1373373821660 false
-        Num entries          : 466
-        Column families      : [waitForCommits, start, md major compactor 1, md major compactor 2, md major compactor 3,
-                                 bringOnline, prep, md major compactor 4, md major compactor 5, md root major compactor 3,
-                                 minorCompaction, wal, compactFiles, md root major compactor 4, md root major compactor 1,
-                                 md root major compactor 2, compact, id, client:update, span, update, commit, write,
-                                 majorCompaction]
-
-Meta block     : BCFile.index
-      Raw size             : 4 bytes
-      Compressed size      : 12 bytes
-      Compression type     : gz
-
-Meta block     : RFile.index
-      Raw size             : 780 bytes
-      Compressed size      : 344 bytes
-      Compression type     : gz
-\end{verbatim}\endgroup
-
-When trying to diagnose problems related to key size, the PrintInfo tool can provide a histogram of the individual key sizes:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo --histogram /accumulo/tables/1/default_tablet/A000000n.rf
-...
-Up to size      count      %-age
-         10 :        222  28.23%
-        100 :        244  71.77%
-       1000 :          0   0.00%
-      10000 :          0   0.00%
-     100000 :          0   0.00%
-    1000000 :          0   0.00%
-   10000000 :          0   0.00%
-  100000000 :          0   0.00%
- 1000000000 :          0   0.00%
-10000000000 :          0   0.00%
-\end{verbatim}\endgroup
-
-Likewise, PrintInfo will dump the key-value pairs and show you the contents of the RFile:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo --dump /accumulo/tables/1/default_tablet/A000000n.rf
-row columnFamily:columnQualifier [visibility] timestamp deleteFlag -> Value
-...
-\end{verbatim}\endgroup
-
-Q. Accumulo is not showing me any data!
-
-A. Do you have your auths set so that it matches your visibilities?
-
-Q. What are my visibilities?
-
-A. Use ``PrintInfo'' on a representative file to get some idea of the visibilities in the underlying data.
-
-Note that PrintInfo is an administrative tool and can only be used by
-someone who can access the underlying Accumulo data. It does not
-provide the normal access controls in Accumulo.
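-
-To check and adjust a user's authorizations from within the Accumulo shell
-(the user name and authorization strings here are placeholders):
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-user@instance> getauths -u myuser
-user@instance> setauths -u myuser -s vis1,vis2
-\end{verbatim}\endgroup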
-
-If you would like to back up, or otherwise examine, the contents of Zookeeper, there are commands to dump and load to/from XML.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.DumpZookeeper --root /accumulo >dump.xml
-$ ./bin/accumulo org.apache.accumulo.server.util.RestoreZookeeper --overwrite < dump.xml
-\end{verbatim}\endgroup
-
-Q. How can I get the information in the monitor page for my cluster monitoring system?
-
-A. Use GetMasterStats:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.test.GetMasterStats | grep Load
- OS Load Average: 0.27
-\end{verbatim}\endgroup
-
-Q. The monitor page is showing an offline tablet. How can I find out which tablet it is?
-
-A. Use FindOfflineTablets:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.FindOfflineTablets
-2<<@(null,null,localhost:9997) is UNASSIGNED  #walogs:2
-\end{verbatim}\endgroup
-
-Here's what the output means:
-
-\begin{enumerate}
-\item{\texttt{2<<} This is the tablet from (-inf, +inf) for the
-  table with id 2. ``tables -l'' in the shell will show table ids for
-  tables.}
-\item{@(null, null, localhost:9997)} Location information. The
-  format is \texttt{@(assigned, hosted, last)}. In this case, the
-  tablet has not been assigned, is not hosted anywhere, and was once
-  hosted on localhost.
-\item{\#walogs:2} The number of write-ahead logs that this tablet requires for recovery.
-\end{enumerate}
-
-An unassigned tablet with write-ahead logs is probably waiting for
-logs to be sorted for efficient recovery.
-
-Q. How can I be sure that the metadata tables are up and consistent?
-
-A. \texttt{CheckForMetadataProblems} will verify that the start/end rows of
-every tablet match up, and that the first tablet's start row and the last
-tablet's end row for each table are empty:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.CheckForMetadataProblems -u root --password
-Enter the connection password:
-All is well for table !0
-All is well for table 1
-\end{verbatim}\endgroup
-
-Q. My hadoop cluster has lost a file due to a NameNode failure. How can I remove the file?
-
-A. There's a utility that will check every file reference and ensure
-that the file exists in HDFS. Optionally, it will remove the
-reference:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.RemoveEntriesForMissingFiles -u root --password
-Enter the connection password:
-2013-07-16 13:10:57,293 [util.RemoveEntriesForMissingFiles] INFO : File /accumulo/tables/2/default_tablet/F0000005.rf
-    is missing
-2013-07-16 13:10:57,296 [util.RemoveEntriesForMissingFiles] INFO : 1 files of 3 missing
-\end{verbatim}\endgroup
-
-Q. I have many entries in zookeeper for old instances I no longer need. How can I remove them?
-
-A. Use CleanZookeeper:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.CleanZookeeper
-\end{verbatim}\endgroup
-
-This command will not delete the instance pointed to by the local \texttt{conf/accumulo-site.xml} file.
-
-Q. I need to decommission a node. How do I stop the tablet server on it?
-
-A. Use the admin command:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo admin stop hostname:9997
-2013-07-16 13:15:38,403 [util.Admin] INFO : Stopping server 12.34.56.78:9997
-\end{verbatim}\endgroup
-
-Q. I cannot log in to a tablet server host, and the tablet server will not shut down. How can I kill the server?
-
-A. Sometimes you can kill a ``stuck'' tablet server by deleting its lock in zookeeper:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks --list
-                  127.0.0.1:9997 TSERV_CLIENT=127.0.0.1:9997
-$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks -delete 127.0.0.1:9997
-$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks -list
-                  127.0.0.1:9997           null
-\end{verbatim}\endgroup
-
-You can find the master and instance id for any accumulo instances using the same zookeeper instance:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.ListInstances
-INFO : Using ZooKeepers localhost:2181
-
- Instance Name       | Instance ID                          | Master
----------------------+--------------------------------------+-------------------------------
-              "test" | 6140b72e-edd8-4126-b2f5-e74a8bbe323b | 127.0.0.1:9999
-\end{verbatim}\endgroup
-
-\section{System Metadata Tables}
-\label{sec:metadata}
-
-Accumulo tracks information about tables in metadata tables. The metadata for
-most tables is contained within the metadata table in the accumulo namespace,
-while the metadata for the metadata table itself is contained in the root table
-in the accumulo namespace. The root table is composed of a single tablet, which
-does not split, so it is also called the root tablet. Information about the root
-table, such as its location and write-ahead logs, is stored in ZooKeeper.
-
-Let's create a table and put some data into it:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> createtable test
-shell> tables -l
-accumulo.metadata    =>        !0
-accumulo.root        =>        +r
-test                 =>         2
-trace                =>         1
-shell> insert a b c d
-shell> flush -w
-\end{verbatim}\endgroup
-
-Now let's take a look at the metadata for this table:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> table accumulo.metadata
-shell> scan -b 3; -e 3<
-3< file:/default_tablet/F000009y.rf []    186,1
-3< last:13fe86cd27101e5 []    127.0.0.1:9997
-3< loc:13fe86cd27101e5 []    127.0.0.1:9997
-3< log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 []    127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995|6
-3< srv:dir []    /default_tablet
-3< srv:flush []    1
-3< srv:lock []    tservers/127.0.0.1:9997/zlock-0000000001$13fe86cd27101e5
-3< srv:time []    M1373998392323
-3< ~tab:~pr []    \x00
-\end{verbatim}\endgroup
-
-Let's decode this little session:
-
-\begin{enumerate}
-\item{\texttt{scan -b 3; -e 3<}\\
-  Every tablet gets its own row. Every row starts with the table id, followed by
-  ``;'' or ``<'', followed by the end row split point for that tablet.}
-\item{\texttt{file:/default\_tablet/F000009y.rf [] 186,1}\\
-  File entry for this tablet. This tablet contains a single file reference. The
-  file is ``/accumulo/tables/3/default\_tablet/F000009y.rf''. It contains 1
-  key/value pair, and is 186 bytes long.}
-\item{\texttt{last:13fe86cd27101e5 [] 127.0.0.1:9997}\\
-  Last location for this tablet. It was last held on 127.0.0.1:9997, and the
-  unique tablet server lock data was ``13fe86cd27101e5''. The default balancer
-  will tend to put tablets back on their last location.}
-\item{\texttt{loc:13fe86cd27101e5 [] 127.0.0.1:9997}\\
-  The current location of this tablet.}
-\item{\texttt{log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 [] 127.0. ...}\\
-  This tablet has a reference to a single write-ahead log.
-  This file can be found in\\
-  /accumulo/wal/127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995. The value
-  of this entry could refer to multiple files. This tablet's data is encoded as
-  ``6'' within the log.}
-\item{\texttt{srv:dir [] /default\_tablet}\\
-  Files written for this tablet will be placed into
-  /accumulo/tables/3/default\_tablet.}
-\item{\texttt{srv:flush [] 1}\\
-  Flush id. This table has successfully completed the flush with the id of
-  ``1''.}
-\item{\texttt{srv:lock [] tservers/127.0.0.1:9997/zlock-0000000001\$13fe86cd27101e5}\\
-  This is the lock information for the tablet server holding the lock on this
-  tablet. This information is checked against zookeeper whenever it is updated,
-  which prevents a metadata update from a tablet server that no longer holds its
-  lock.}
-\item{\texttt{srv:time [] M1373998392323}\\
-  The tablet's time type and most recent time. The leading ``M'' indicates
-  milliseconds since the epoch; an ``L'' would indicate logical time.}
-\item{\texttt{\textasciitilde{}tab:\textasciitilde{}pr [] \textbackslash{}x00}\\
-  The end-row marker for the previous tablet (prev-row). The first byte
-  indicates the presence of a prev-row. This tablet has the range (-inf, +inf),
-  so it has no prev-row (or end row).}
-\end{enumerate}
-
-Besides these columns, you may see:
-
-\begin{enumerate}
-\item{\texttt{rowId future:zooKeeperID location} The tablet has been assigned to a tablet server, but not yet loaded.}
-\item{\texttt{\textasciitilde{}del:filename} When a tablet server is done using a file, it will create a delete marker in the appropriate metadata table, unassociated with any tablet. The garbage collector will remove the marker, and the file, when no other reference to the file exists.}
-\item{\texttt{\textasciitilde{}blip:txid} Bulk-Load In Progress marker}
-\item{\texttt{rowId loaded:filename} A file has been bulk-loaded into this tablet; however, the bulk load has not yet completed on other tablets, so this marker prevents the file from being loaded multiple times.}
-\item{\texttt{rowId !cloned} A marker that indicates that this tablet has been successfully cloned.}
-\item{\texttt{rowId splitRatio:ratio} A marker that indicates a split is in progress, and the files are being split at the given ratio.}
-\item{\texttt{rowId chopped} A marker that indicates that the files in the tablet do not contain keys outside the range of the tablet.}
-\item{\texttt{rowId scan} A marker that prevents a file from being removed while there are still active scans using it.}
-
-\end{enumerate}
-
-\section{Simple System Recovery}
-
-Q. One of my Accumulo processes died. How do I bring it back?
-
-The easiest way to bring all services online for an Accumulo instance is to run the ``start-all.sh`` script.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ bin/start-all.sh
-\end{verbatim}\endgroup
-
-This process will check the process listing using ``jps`` on each host before attempting to restart a service on that host.
-Typically, this check is sufficient except in the face of a hung/zombie process. For large clusters, it may be
-undesirable to ssh to every node in the cluster to ensure that all hosts are running the appropriate processes, and ``start-here.sh`` may be of use.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ ssh host_with_dead_process
-    $ bin/start-here.sh
-\end{verbatim}\endgroup
-
-``start-here.sh`` should be invoked on the host which is missing a given process. Like start-all.sh, it will start all
-necessary processes that are not currently running, but only on the current host and not cluster-wide. Tools such as ``pssh`` or
-``pdsh`` can be used to automate this process.
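-
-For example, assuming a hostfile ``tservers.txt`` listing the hosts to check
-(the file name is a placeholder), something like the following could run
-``start-here.sh`` everywhere it is needed:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ pssh -h tservers.txt '$ACCUMULO_HOME/bin/start-here.sh'
-\end{verbatim}\endgroup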
-
-``start-server.sh`` can also be used to start a process on a given host; however, it is not generally recommended for
-users to issue this directly as the ``start-all.sh`` and ``start-here.sh`` scripts provide the same functionality with
-more automation and are less prone to user error.
-
-A. Use ``start-all.sh`` or ``start-here.sh``.
-
-Q. My process died again. Should I restart it via ``cron`` or tools like ``supervisord``?
-
-A. A repeatedly dying Accumulo process is a sign of a larger problem. Typically these problems are due to a
-misconfiguration of Accumulo or over-saturation of resources. Blindly automating service restarts inside of Accumulo
-is generally undesirable, as it masks and ignores the underlying problem. Accumulo
-processes should be stable on the order of months and not require frequent restart.
-
-
-\section{Advanced System Recovery}
-
-\subsection{HDFS Failure}
-Q. I had a disastrous HDFS failure. After bringing everything back up, several tablets refuse to go online.
-
-Data written to tablets is written into memory before being written into indexed files. In case the server
-is lost before the data is saved into an indexed file, all data stored in memory is first written into a
-write-ahead log (WAL). When a tablet is re-assigned to a new tablet server, the write-ahead logs are read to
-recover any mutations that were in memory when the tablet was last hosted.
-
-If a write-ahead log cannot be read, then the tablet is not re-assigned. All it takes is for one of
-the blocks in the write-ahead log to be missing. This is unlikely unless multiple data nodes in HDFS have been
-lost.
-
-A. Get the WAL files online and healthy. Restore any data nodes that may be down.
-
-Q. How do I find out which tablets are offline?
-
-A. Use ``accumulo admin checkTablets''
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ bin/accumulo admin checkTablets
-\end{verbatim}\endgroup
-
-Q. I lost three data nodes, and I'm missing blocks in a WAL. I don't care about data loss; how
-can I get those tablets online?
-
-See the discussion in section~\ref{sec:metadata}, which shows a typical metadata table listing.
-The entries with a column family of ``log'' are references to the WAL for that tablet.
-If you know which WAL is bad, you can find all the references with a grep in the shell:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> grep 0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995
-3< log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 []    127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995|6
-\end{verbatim}\endgroup
-
-A. You can remove the WAL references in the metadata table.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> grant -u root Table.WRITE -t accumulo.metadata
-shell> delete 3< log 127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995
-\end{verbatim}\endgroup
-
-Note: the colon (``:'') is omitted when specifying the ``row cf cq'' for the delete command.
-
-The master will automatically discover that the tablet no longer has a bad WAL reference and will
-assign the tablet. You will need to remove the reference from all the affected tablets to get them
-online.
-
-
-Q. The metadata (or root) table has references to a corrupt WAL.
-
-This is a much more serious state, since losing updates to the metadata table will result
-in references to old files which may not exist, or lost references to new files, resulting
-in tablets that cannot be read, or large amounts of data loss.
-
-The best hope is to restore the WAL by fixing HDFS data nodes and bringing the data back online.
-If this is not possible, the best approach is to re-create the instance and bulk import all files from
-the old instance into new tables.
-
-A complete set of instructions for doing this is outside the scope of this guide,
-but the basic approach is:
-
-\begin{itemize}
-  \item Use ``tables -l'' in the shell to discover the table name to table id mapping
-  \item Stop all accumulo processes on all nodes
-  \item Move the accumulo directory in HDFS out of the way:
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ hadoop fs -mv /accumulo /corrupt
-\end{verbatim}\endgroup
-  \item Re-initialize accumulo
-  \item Recreate tables, users and permissions
-  \item Import the directories under \texttt{/corrupt/tables/<id>} into the new instance
-\end{itemize}
-
-Q. One or more HDFS files under /accumulo/tables are corrupt
-
-Accumulo maintains multiple references to the tablet files in the METADATA
-table and within the tablet server hosting the file; this makes it difficult to
-reliably just remove those references.
-
-The directory structure in HDFS for tables will follow the general structure:
-
-\small
-\begin{verbatim}
-  /accumulo
-  /accumulo/tables/
-  /accumulo/tables/!0
-  /accumulo/tables/!0/default_tablet/A000001.rf
-  /accumulo/tables/!0/t-00001/A000002.rf
-  /accumulo/tables/1
-  /accumulo/tables/1/default_tablet/A000003.rf
-  /accumulo/tables/1/t-00001/A000004.rf
-  /accumulo/tables/1/t-00001/A000005.rf
-  /accumulo/tables/2/default_tablet/A000006.rf
-  /accumulo/tables/2/t-00001/A000007.rf
-\end{verbatim}
-\normalsize
-
-If files under /accumulo/tables are corrupt, the best course of action is to
-recover those files in HDFS (see the section on HDFS). Once these recovery efforts
-have been exhausted, the next step depends on where the missing file(s) are
-located. Different actions are required when the bad files are in Accumulo data
-table files or if they are metadata table files.
-
-{\bf Data File Corruption}
-
-When an Accumulo data file is corrupt, the most reliable way to restore Accumulo
-operations is to replace the missing file with an ``empty'' file so that
-references to the file in the METADATA table and within the tablet server
-hosting the file can be resolved by Accumulo. An empty file can be created using
-the CreateEmpty utility:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ accumulo org.apache.accumulo.core.file.rfile.CreateEmpty /path/to/empty/file/empty.rf
-\end{verbatim}\endgroup
-
-The process is to delete the corrupt file and then move the empty file into its
-place (the generated empty file can be copied and used multiple times if necessary and does not need
-to be regenerated each time):
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-    $ hadoop fs -rm /accumulo/tables/corrupt/file/thename.rf; \
-    hadoop fs -mv /path/to/empty/file/empty.rf /accumulo/tables/corrupt/file/thename.rf
-\end{verbatim}\endgroup
-
-{\bf Metadata File Corruption}
-
-If the corrupt files are metadata files (under the path
-\texttt{/accumulo/tables/!0}; see \ref{sec:metadata}), then you will need to rebuild
-the metadata table by initializing a new instance of Accumulo and then importing
-all of the existing data into the new instance.
-This is the same procedure as
-recovering from a zookeeper failure (see \ref{ZooKeeper Failure}), except that
-you will have the benefit of having the existing user and table authorizations
-that are maintained in zookeeper.
-
-You can use the DumpZookeeper utility to save this information for reference
-before creating the new instance. You will not be able to use RestoreZookeeper
-because the table names and references are likely to be different between the
-original and the new instances, but it can serve as a reference.
-
-A. If the files cannot be recovered, replace corrupt data files with empty
-RFiles to allow references in the metadata table and in the tablet servers to be
-resolved. Rebuild the metadata table if the corrupt files are metadata files.
-
-\subsection{ZooKeeper Failure}
-Q. I lost my ZooKeeper quorum (hardware failure), but HDFS is still intact. How can I recover my Accumulo instance?
-
-ZooKeeper, in addition to its lock-service capabilities, also serves to bootstrap an Accumulo
-instance from some location in HDFS. It contains the pointer to the root tablet in HDFS, which
-is then used to load the Accumulo metadata tablets, which in turn load all user tables. ZooKeeper
-also stores all namespace and table configuration, the user database, the mapping of table IDs to
-table names, and more across Accumulo restarts.
-
-Presently, the only way to recover such an instance is to initialize a new instance and import all
-of the old data into the new instance. The easiest way to tackle this problem is to first recreate
-the mapping of table ID to table name and then recreate each of those tables in the new instance.
-Set any necessary configuration on the new tables and add split points to approximate the splits
-of the old tables, since the new tables start with none.
-
-The directory structure in HDFS for tables will follow the general structure:
-
-\small
-\begin{verbatim}
-  /accumulo
-  /accumulo/tables/
-  /accumulo/tables/1
-  /accumulo/tables/1/default_tablet/A000001.rf
-  /accumulo/tables/1/t-00001/A000002.rf
-  /accumulo/tables/1/t-00001/A000003.rf
-  /accumulo/tables/2/default_tablet/A000004.rf
-  /accumulo/tables/2/t-00001/A000005.rf
-\end{verbatim}
-\normalsize
-
-For each table, make a new directory that you can move (or copy if you have the HDFS space to do so)
-all of the RFiles for a given table into. For example, to process the table with an ID of ``1``, make a new directory,
-say ``/new-table-1`` and then copy all files from ``/accumulo/tables/1/*/*.rf`` into that directory. Additionally,
-make a directory, ``/new-table-1-failures``, for any failures during the import process. Then, issue the import
-command using the Accumulo shell into the new table, telling Accumulo to not re-set the timestamp:
-
-\small
-\begin{verbatim}
-user@instance new_table> importdirectory /new-table-1 /new-table-1-failures false
-\end{verbatim}
-\normalsize
-
-Any RFiles which failed to load will be placed in ``/new-table-1-failures``. RFiles that were successfully
-imported will no longer exist in ``/new-table-1``. For failures, move them back to the import directory and retry
-the ``importdirectory`` command.
-
-It is \textbf{extremely} important to note that this approach may introduce stale data back into
-the tables. For a few reasons, RFiles may exist in the table directory which are candidates for deletion but have
-not yet been deleted.
-Additionally, deleted data which was not compacted away, but which still exists in write-ahead logs, will
-be re-introduced in the new instance if the original instance was somehow recoverable. Table splits and merges
-(which also include the deleteRows API call on TableOperations) are also vulnerable to this problem. This process should
-\textbf{not} be used if these are unacceptable risks. It is possible to try to re-create a view of the ``accumulo.metadata``
-table to prune out files that are candidates for deletion, but this is a difficult task that also may not be entirely accurate.
-
-Likewise, it is also possible that data loss may occur from write-ahead log (WAL) files which existed on the old table but
-were not minor-compacted into an RFile. Again, it may be possible to reconstruct the state of these WAL files to
-replay data not yet in an RFile; however, this is a difficult task and is not implemented in any automated fashion.
-
-A. The ``importdirectory`` shell command can be used to import RFiles from the old instance into a newly created instance,
-but extreme care should go into the decision to do this as it may result in reintroduction of stale data or the
-omission of new data.
-
-\section{File Naming Conventions}
-
-Q. Why are files named like they are? Why do some start with ``C'' and others with ``F''?
-
-A. The file names give you a basic idea for the source of the file.
-
-The base of the filename is a base-36 unique number. All filenames in accumulo are coordinated
-with a counter in zookeeper, so they are always unique, which is useful for debugging.
-
-The leading letter gives you an idea of how the file was created:
-
-\begin{itemize}
-  \item F - Flush: entries in memory were written to a file (Minor Compaction)
-  \item M - Merging compaction: entries in memory were combined with the smallest file to create one new file
-  \item C - Several files, but not all files, were combined to produce this file (Major Compaction)
-  \item A - All files were compacted, delete entries were dropped
-  \item I - Bulk import, complete, sorted index files. Always in a directory starting with "b-"
-\end{itemize}
-
-This simple file naming convention allows you to see the basic structure of the files from just
-their filenames, and reason about what should be happening to them next, just
-by scanning their entries in the metadata tables.
-
-For example, if you see multiple files with ``M'' prefixes, the tablet is, or was, up against its
-maximum file limit, so it began merging memory updates with files to keep the file count reasonable. This
-slows down ingest performance, so many files like this tell you that the system is struggling to
-compact files as quickly as they are being ingested.
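-
-For example, to see which file names (and thus which prefixes) a table's tablets
-currently reference, you can scan the ``file'' column family of the metadata
-table as shown earlier (the table id ``3'' here is a placeholder):
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> table accumulo.metadata
-shell> scan -b 3; -e 3< -c file
-\end{verbatim}\endgroup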
-
http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png b/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png
deleted file mode 100644
index 7f18d3f..0000000
Binary files a/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png b/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png
deleted file mode 100644
index c131de6..0000000
Binary files a/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index 78af9c5..2b92402 100644
--- a/pom.xml
+++ b/pom.xml
@@ -230,7 +230,7 @@
       <artifactId>accumulo-docs</artifactId>
       <version>${project.version}</version>
       <classifier>user-manual</classifier>
-      <type>pdf</type>
+      <type>html</type>
     </dependency>
     <dependency>
       <groupId>org.apache.accumulo</groupId>
@@ -594,6 +594,11 @@
         </configuration>
       </plugin>
       <plugin>
+        <groupId>org.asciidoctor</groupId>
+        <artifactId>asciidoctor-maven-plugin</artifactId>
+        <version>0.1.4</version>
+      </plugin>
+      <plugin>
         <groupId>org.codehaus.mojo</groupId>
         <artifactId>build-helper-maven-plugin</artifactId>
         <version>1.8</version>
@@ -621,11 +626,6 @@
         <version>1.2.1</version>
       </plugin>
       <plugin>
-        <groupId>org.codehaus.mojo</groupId>
-        <artifactId>latex-maven-plugin</artifactId>
-        <version>1.1</version>
-      </plugin>
-      <plugin>
         <groupId>org.eclipse.m2e</groupId>
         <artifactId>lifecycle-mapping</artifactId>
         <version>1.0.0</version>