
buildbot Tue, 29 Apr 2014 11:24:56 -0700

Author: buildbot
Date: Tue Apr 29 18:24:04 2014
New Revision: 907368

Log:
Staging update by buildbot for accumulo


Modified:
    websites/staging/accumulo/trunk/content/   (props changed)
    websites/staging/accumulo/trunk/content/glossary.html
    websites/staging/accumulo/trunk/content/release_notes/1.6.0.html

Propchange: websites/staging/accumulo/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Apr 29 18:24:04 2014
@@ -1 +1 @@
-1587633
+1591047

Modified: websites/staging/accumulo/trunk/content/glossary.html
==============================================================================
--- websites/staging/accumulo/trunk/content/glossary.html (original)
+++ websites/staging/accumulo/trunk/content/glossary.html Tue Apr 29 18:24:04 2014
@@ -101,21 +101,21 @@
 <li><strong>iterator</strong> - a mechanism for modifying tablet-local 
portions of the key/value space. Iterators are used for standard administrative 
tasks as well as for custom processing.</li>
 <li><strong>iterator priority</strong> - an iterator must be configured with a 
particular scope and priority.  When a tablet server enters that scope, it will 
instantiate iterators in priority order starting from the smallest priority and 
ending with the largest, and apply each to the data read before rewriting the 
data or sending the data to the user.</li>
 <li><strong>iterator scopes</strong> - the possible scopes for iterators are 
where the tablet server is already reading and/or writing data: minor 
compaction / flush time (<em>minc</em> scope), major compaction / file merging 
time (<em>majc</em> scope), and query time (<em>scan</em> scope)</li>
-<li><strong>gc</strong> - </li>
+<li><strong>gc</strong> - process that identifies temporary files in HDFS that 
are no longer needed by any process, and deletes them.</li>
 <li><strong>key</strong> - the key into the distributed sorted map that is 
Accumulo.  The key is subdivided into row, column, and timestamp.  The column 
is further divided into family, qualifier, and visibility.</li>
 <li><strong>locality group</strong> - a set of column families that will be 
grouped together on disk.  With no locality groups configured, data is stored 
on disk in row order.  If each column family were configured to be its own 
locality group, the data for each column would be stored separately, in row 
order.  Configuring sets of columns into locality groups is a compromise 
between the two approaches and will improve performance when multiple columns 
are accessed in the same scan.</li>
 <li><strong>log-structured merge-tree</strong> - the sorting / flushing / 
merging scheme on which BigTable's design is based.</li>
-<li><strong>logger</strong> - </li>
+<li><strong>logger</strong> - in 1.4 and older, process that accepts updates 
to tablet servers and writes them to local on-disk storage for redundancy. In 
1.5, the functionality was subsumed by the tablet server and datanode with HDFS 
writes.</li>
 <li><strong>major compaction</strong> - merging multiple files into a single 
file.  If all of a tablet's files are merged into a single file, it is called a 
<em>full major compaction</em>.</li>
-<li><strong>master</strong> - </li>
+<li><strong>master</strong> - process that detects and responds to tablet 
failures, balances load across tablet servers by assigning and migrating 
tablets when required, coordinates table operations, and handles tablet server 
logistics (startup, shutdown, recovery).</li>
 <li><strong>minor compaction</strong> - flushing data from memory to disk.  
Usually this creates a new file for a tablet, but if the memory flushed is 
merge-sorted in with data from an existing file (replacing that file), it is 
called a <em>merging minor compaction</em>.</li>
-<li><strong>monitor</strong> -</li>
+<li><strong>monitor</strong> - process that displays status and usage 
information for all Accumulo components.</li>
 <li><strong>permissions</strong> - administrative abilities that must be given 
to a user such as creating tables or users and changing permissions or 
configuration parameters.</li>
 <li><strong>row</strong> - the portion of the key that controls atomicity.  
Keys with the same row are guaranteed to remain on a single tablet hosted by a 
single tablet server, therefore multiple key/value pairs can be added to or 
removed from a row at the same time. The row is used for the primary sorting of 
the key.</li>
 <li><strong>scan</strong> - reading a range of key/value pairs.</li>
 <li><strong>tablet</strong> - a contiguous key range; the unit of work for a 
tablet server.</li>
 <li><strong>tablet servers</strong> - a set of servers that hosts reads and 
writes for tablets.  Each server hosts a distinct set of tablets at any given 
time, but the tablets may be hosted by different servers over time.</li>
-<li><strong>timestamp</strong> - the portion of the key that controls 
versioning.  Otherwise identical keys with differing timestamps are considered 
to be versions of a single <em>cell</em>.  Accumulo can be configured to keep 
the <em>N</em> newest versions of each <em>cell</em>.  When a deletion entry is 
inserted, it deletes all earlier versions for its cell.</li>
+<li><strong>timestamp</strong> - the portion of the key that controls 
versioning. Otherwise identical keys with differing timestamps are considered 
to be versions of a single <em>cell</em>.  Accumulo can be configured to keep 
the <em>N</em> newest versions of each <em>cell</em>.  When a deletion entry is 
inserted, it deletes all earlier versions for its cell.</li>
 <li><strong>value</strong> - immutable bytes associated with a particular 
key.</li>
 </ul>
   </div>

Modified: websites/staging/accumulo/trunk/content/release_notes/1.6.0.html
==============================================================================
--- websites/staging/accumulo/trunk/content/release_notes/1.6.0.html (original)
+++ websites/staging/accumulo/trunk/content/release_notes/1.6.0.html Tue Apr 29 18:24:04 2014
@@ -95,7 +95,7 @@
 <p>Apache Accumulo 1.6.0 adds some major new features and fixes many bugs.  
This release contains changes from 609 issues contributed by 36 contributors 
and committers.  </p>
 <p>Accumulo 1.6.0 runs on Hadoop 1; however, Hadoop 2 with HA namenode is 
recommended for production systems.  In addition to HA, Hadoop 2 also offers 
better data durability guarantees than Hadoop 1 when nodes lose power.</p>
 <h2 id="notable-improvements">Notable Improvements</h2>
-<h3 id="multiple-namenode-support">Multiple namenode support</h3>
+<h3 id="multiple-volume-support">Multiple volume support</h3>
 <p><a href="http://research.google.com/archive/bigtable.html">BigTable's</a> 
design allows for its internal metadata to automatically spread across multiple 
nodes.  Accumulo has followed this design and scales very well as a result.  
There is one impediment to scaling though, and this is the HDFS namenode.  
There are two problems with the namenode when it comes to scaling.  First, the 
namenode stores all of its filesystem metadata in memory on a single machine.  
This introduces an upper bound on the number of files Accumulo can have.  
Second, there is an upper bound on the number of file operations per second 
that a single namenode can support.  For example, a namenode can only support a 
few thousand delete or create file requests per second.  </p>
 <p>To overcome this bottleneck, support for multiple namenodes was added under 
<a href="https://issues.apache.org/jira/browse/ACCUMULO-118" title="Multiple 
namenode support">ACCUMULO-118</a>.  This change allows Accumulo to store its 
files across multiple namenodes.  To use this feature, place a comma-separated 
list of namenode URIs in the new <em>instance.volumes</em> configuration 
property in accumulo-site.xml.  If multiple namenode support is desired when 
upgrading to 1.6.0, modify this setting <strong>only</strong> after a 
successful upgrade.</p>
 <h3 id="table-namespaces">Table namespaces</h3>
@@ -181,6 +181,7 @@ issues such as JVM garbage collection pa
 <li><a href="https://issues.apache.org/jira/browse/ACCUMULO-2519" title="FATE 
operation failed across upgrade">ACCUMULO-2519</a> FATE operation failed across 
upgrade</li>
 </ul>
 <h2 id="known-issues">Known Issues</h2>
+<h3 id="slower-writes-than-previous-accumulo-versions">Slower writes than 
previous Accumulo versions</h3>
 <p>When using Accumulo 1.6 and Hadoop 2, Accumulo will call hsync() on HDFS.
 Calling hsync improves durability by ensuring data is on disk (where other 
older 
 Hadoop versions might lose data in the face of power failure); however, calling
@@ -191,6 +192,14 @@ the number of concurrent writers to that
 50 concurrent writers would equate to approximately 200M of Java heap being 
used for
 mutation queues.</p>
 <p>For more information, see <a 
href="https://issues.apache.org/jira/browse/ACCUMULO-1950" title="Reduce the 
number of calls to hsync">ACCUMULO-1950</a> and <a 
href="https://issues.apache.org/jira/browse/ACCUMULO-1905?focusedCommentId=13915208&amp;page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13915208">this
 comment</a>.</p>
+<p>Another possible cause of slower writes is the change in write-ahead log 
replication 
+between 1.4 and 1.5.  Accumulo 1.4 defaulted to two logger servers.  
Accumulo 1.5 and 1.6 store 
+write-ahead logs in HDFS and default to using three datanodes.  </p>
+<h3 id="batchwriter-hold-time-error">BatchWriter hold time error</h3>
+<p>If a <code>BatchWriter</code> fails with 
<code>MutationsRejectedException</code> and the message contains
+<code>"# server errors 1"</code>, then it may be <a 
href="https://issues.apache.org/jira/browse/ACCUMULO-2388">ACCUMULO-2388</a>.  
To confirm this, look in the tablet server logs 
+for <code>org.apache.accumulo.tserver.HoldTimeoutException</code> around the 
time the <code>BatchWriter</code> failed.
+If this happens often, a possible workaround is to set 
<code>general.rpc.timeout</code> to <code>240s</code>.    </p>
 <h3 id="other-known-issues">Other known issues</h3>
 <ul>
 <li><a href="https://issues.apache.org/jira/browse/ACCUMULO-1507" 
title="Dynamic Classloader still can't keep proper track of 
jars">ACCUMULO-1507</a> Dynamic Classloader still can't keep proper track of 
jars</li>
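
The release notes in the diff above name two accumulo-site.xml settings: instance.volumes (multiple volume support) and general.rpc.timeout (the suggested workaround for ACCUMULO-2388). A minimal sketch of how they might look, assuming the usual Hadoop-style property layout; the namenode host names, port, and /accumulo path are hypothetical placeholders and do not come from the commit:

```xml
<!-- Sketch only: hosts, port, and path below are hypothetical placeholders. -->
<!-- Per the notes, change instance.volumes only after a successful upgrade to 1.6.0. -->
<property>
  <name>instance.volumes</name>
  <value>hdfs://namenode1.example.com:8020/accumulo,hdfs://namenode2.example.com:8020/accumulo</value>
</property>

<!-- Possible workaround from the notes when HoldTimeoutException appears often. -->
<property>
  <name>general.rpc.timeout</name>
  <value>240s</value>
</property>
```

Both properties would sit inside the top-level configuration element of accumulo-site.xml. The 240s value is the one given in the notes; the volume URIs must be replaced with the actual namenode URIs for the cluster.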

svn commit: r907368 - in /websites/staging/accumulo/trunk/content: ./ glossary.html release_notes/1.6.0.html
