Author: buildbot
Date: Fri Apr 4 21:08:52 2014
New Revision: 904956

Log:
Staging update by buildbot for accumulo
Modified:
    websites/staging/accumulo/trunk/content/ (props changed)
    websites/staging/accumulo/trunk/content/release_notes/1.6.0.html

Propchange: websites/staging/accumulo/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Apr 4 21:08:52 2014
@@ -1 +1 @@
-1584908
+1584912

Modified: websites/staging/accumulo/trunk/content/release_notes/1.6.0.html
==============================================================================
--- websites/staging/accumulo/trunk/content/release_notes/1.6.0.html (original)
+++ websites/staging/accumulo/trunk/content/release_notes/1.6.0.html Fri Apr 4 21:08:52 2014
@@ -114,15 +114,17 @@
 <p>One of the key elements of the Big Table design is use of the Log Structured Merge Tree (LSMT) concept. This entails sorting data in memory, writing out sorted files, and then later merging multiple sorted files into a single file. These automatic merges happen in the background and Accumulo decides when to merge files based on comparing relative sizes of files to a compaction ratio. Adjusting the compaction ratio is the only way a user can control this process. <a href="https://issues.apache.org/jira/browse/ACCUMULO-1451" title="Make Compaction triggers extensible">ACCUMULO-1451</a> introduces pluggable compaction strategies which allow users to choose when and what files to compact. <a href="https://issues.apache.org/jira/browse/ACCUMULO-1808" title="Create compaction strategy that has size limit">ACCUMULO-1808</a> adds a compaction strategy that prevents compaction of files over a configurable size.</p>
 <h3 id="lexicoders">Lexicoders</h3>
 <p>Accumulo only sorts data lexicographically. Getting something like a pair of (<string>,<integer>) to sort correctly in Accumulo is tricky. It's tricky because you only want to compare the integers if the strings are equal.
It's possible to make this sort properly in Accumulo if the data is encoded properly, but that's the tricky part. To make this easier, <a href="https://issues.apache.org/jira/browse/ACCUMULO-1336" title="Add lexicoders from Typo to Accumulo">ACCUMULO-1336</a> added Lexicoders to the Accumulo API. Lexicoders provide an easy way to serialize data so that it sorts properly lexicographically. Below is a simple example.</p>
-<blockquote>
-<p>PairLexicoder plex = new PairLexicoder(new StringLexicoder(), new IntegerLexicoder());
-byte[] ba1 = plex.encode(new ComparablePair<String, Integer>("b",1));
-byte[] ba2 = plex.encode(new ComparablePair<String, Integer>("aa",1));
-byte[] ba3 = plex.encode(new ComparablePair<String, Integer>("a",2));
-byte[] ba4 = plex.encode(new ComparablePair<String, Integer>("a",1));
-byte[] ba5 = plex.encode(new ComparablePair<String, Integer>("aa",-3));</p>
-<p>//sorting ba1,ba2,ba3,ba4, and ba5 lexicographically will result in the same order as sorting the ComparablePairs</p>
-</blockquote>
+<div class="codehilite"><pre> <span class="n">PairLexicoder</span> <span class="n">plex</span> <span class="p">=</span> <span class="n">new</span> <span class="n">PairLexicoder</span><span class="p">(</span><span class="n">new</span> <span class="n">StringLexicoder</span><span class="p">(),</span> <span class="n">new</span> <span class="n">IntegerLexicoder</span><span class="p">());</span>
+ <span class="n">byte</span><span class="p">[]</span> <span class="n">ba1</span> <span class="p">=</span> <span class="n">plex</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="n">new</span> <span class="n">ComparablePair</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span class="o">></span><span class="p">(</span>"<span class="n">b</span>"<span class="p">,</span>1<span class="p">));</span>
+ <span class="n">byte</span><span class="p">[]</span> <span class="n">ba2</span> <span class="p">=</span> <span class="n">plex</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="n">new</span> <span class="n">ComparablePair</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span class="o">></span><span class="p">(</span>"<span class="n">aa</span>"<span class="p">,</span>1<span class="p">));</span>
+ <span class="n">byte</span><span class="p">[]</span> <span class="n">ba3</span> <span class="p">=</span> <span class="n">plex</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="n">new</span> <span class="n">ComparablePair</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span class="o">></span><span class="p">(</span>"<span class="n">a</span>"<span class="p">,</span>2<span class="p">));</span>
+ <span class="n">byte</span><span class="p">[]</span> <span class="n">ba4</span> <span class="p">=</span> <span class="n">plex</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="n">new</span> <span class="n">ComparablePair</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span class="o">></span><span class="p">(</span>"<span class="n">a</span>"<span class="p">,</span>1<span class="p">));</span>
+ <span class="n">byte</span><span class="p">[]</span> <span class="n">ba5</span> <span class="p">=</span> <span class="n">plex</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="n">new</span> <span class="n">ComparablePair</span><span class="o"><</span><span class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span class="o">></span><span class="p">(</span>"<span class="n">aa</span>"<span class="p">,</span><span class="o">-</span>3<span class="p">));</span>
+
+ <span class="o">//</span><span class="n">sorting</span> <span class="n">ba1</span><span class="p">,</span><span class="n">ba2</span><span class="p">,</span><span class="n">ba3</span><span class="p">,</span><span class="n">ba4</span><span class="p">,</span> <span class="n">and</span> <span class="n">ba5</span> <span class="n">lexicographically</span> <span class="n">will</span> <span class="n">result</span> <span class="n">in</span> <span class="n">the</span> <span class="n">same</span> <span class="n">order</span> <span class="n">as</span> <span class="n">sorting</span> <span class="n">the</span> <span class="n">ComparablePairs</span>
+</pre></div>
+
+
 <h3 id="multi-table-accumulo-input-format">Multi-table Accumulo input format</h3>
 <p><a href="https://issues.apache.org/jira/browse/ACCUMULO-391" title="Multi-table input format">ACCUMULO-391</a> makes it possible to easily read from multiple tables in a MapReduce job. TODO is there more to say about this, if not maybe move to one-liners.</p>
 <h3 id="locality-groups-in-memory">Locality groups in memory</h3>