Author: kturner
Date: Fri Apr  4 21:03:50 2014
New Revision: 1584908

URL: http://svn.apache.org/r1584908
Log:
ACCUMULO-2396 checkin of WIP 1.6.0 release notes

Added:
    accumulo/site/trunk/content/release_notes/1.6.0.mdtext   (with props)

Added: accumulo/site/trunk/content/release_notes/1.6.0.mdtext
URL: http://svn.apache.org/viewvc/accumulo/site/trunk/content/release_notes/1.6.0.mdtext?rev=1584908&view=auto
==============================================================================
--- accumulo/site/trunk/content/release_notes/1.6.0.mdtext (added)
+++ accumulo/site/trunk/content/release_notes/1.6.0.mdtext Fri Apr  4 21:03:50 2014
@@ -0,0 +1,238 @@
+Title:
+Notice:    Licensed to the Apache Software Foundation (ASF) under one
+           or more contributor license agreements.  See the NOTICE file
+           distributed with this work for additional information
+           regarding copyright ownership.  The ASF licenses this file
+           to you under the Apache License, Version 2.0 (the
+           "License"); you may not use this file except in compliance
+           with the License.  You may obtain a copy of the License at
+           .
+             http://www.apache.org/licenses/LICENSE-2.0
+           .
+           Unless required by applicable law or agreed to in writing,
+           software distributed under the License is distributed on an
+           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+           KIND, either express or implied.  See the License for the
+           specific language governing permissions and limitations
+           under the License.
+
+**DRAFT 1.6.0 RELEASE NOTES**
+
+Apache Accumulo 1.6.0
+
+This document is a work in progress.
+
+## Notable Improvements
+
+### Multiple namenode support
+
+BigTable's design allows its internal metadata to spread automatically across multiple nodes.  Accumulo has followed this design and scales very well as a result.  There is one impediment to scaling, though: the HDFS namenode.  The namenode limits scaling in two ways.  First, it stores all of its filesystem metadata in memory on a single machine, which puts an upper bound on the number of files Accumulo can have.  Second, there is an upper bound on the number of file operations per second that a single namenode can support.  For example, a namenode can only support a few thousand delete or create file requests per second.
+
+To overcome this bottleneck, support for multiple namenodes was added under [ACCUMULO-118][ACCUMULO-118].  This change allows Accumulo to store its files across multiple namenodes.  To use this feature, place a comma-separated list of namenode URIs in the new instance.volumes configuration property.  When upgrading from an earlier version, modify this setting only after the upgrade has completed successfully.
+
+### Table namespaces
+
+Administering an Accumulo instance with many tables is cumbersome.  To ease this, [ACCUMULO-802][ACCUMULO-802] introduced table namespaces, which allow tables to be grouped.  Configuration and permission changes made to a namespace apply to all of its tables.  Example use cases are ... TODO
+
+### Conditional Mutations
+
+Until now, Accumulo has not offered a way to make atomic read-modify-write changes to a row.  It now supports atomic test-and-set row operations.  [ACCUMULO-1000][ACCUMULO-1000] added conditional mutations and a conditional writer.  A conditional mutation has tests on columns that must pass before any changes are made.  These tests are executed in server processes while a row lock is held.  Below is a simple example of making atomic row changes using conditional mutations.
+
+ 1. Read columns X,Y,SEQ into a,b,s from row R1 using an isolated scanner.
+ 2. For row R1 write conditional mutation X=f(a),Y=g(b),SEQ=s+1 if SEQ==s.
+ 3. If the conditional mutation failed, go to step 1.
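
The three steps above can be sketched in plain Java. This is a hypothetical in-memory stand-in, not the Accumulo conditional writer API: `ConditionalRowSketch`, `Row`, `writeIfSeqEquals`, and `update` are illustrative names, and the `synchronized` methods play the role of the server-side row lock.

```java
import java.util.function.LongUnaryOperator;

/** Hypothetical in-memory stand-in for a single row; not the Accumulo API. */
final class ConditionalRowSketch {
    /** Immutable snapshot of columns X, Y, and SEQ. */
    static final class Row {
        final long x, y, seq;
        Row(long x, long y, long seq) { this.x = x; this.y = y; this.seq = seq; }
    }

    private Row current = new Row(0, 0, 0);

    /** Step 1: read a consistent snapshot of the row. */
    synchronized Row read() {
        return current;
    }

    /**
     * Step 2: apply the change only if SEQ still equals expectedSeq.
     * The test and the write happen under the same lock.
     */
    synchronized boolean writeIfSeqEquals(long expectedSeq, long newX, long newY) {
        if (current.seq != expectedSeq) {
            return false; // another writer changed the row since it was read
        }
        current = new Row(newX, newY, expectedSeq + 1);
        return true;
    }

    /** Steps 1 to 3: retry until the conditional write succeeds; returns the attempt count. */
    int update(LongUnaryOperator f, LongUnaryOperator g) {
        int attempts = 0;
        while (true) {
            attempts++;
            Row r = read();
            if (writeIfSeqEquals(r.seq, f.applyAsLong(r.x), g.applyAsLong(r.y))) {
                return attempts;
            }
        }
    }
}
```

Against a real Accumulo instance, the read would go through an isolated scanner and the write through the conditional writer added by ACCUMULO-1000; only the retry structure carries over from this sketch.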
+
+The only built-in tests that conditional mutations support are equality and isNull.  However, iterators can be configured on a conditional mutation to run before these tests.  This makes it possible to implement any number of tests, such as less than, greater than, contains, etc.
+
+### Encryption
+
+Support was added for encrypting Accumulo's persistent and over-the-wire data.  [ACCUMULO-998][ACCUMULO-998], [ACCUMULO-958][ACCUMULO-958], and [ACCUMULO-980][ACCUMULO-980] cover encrypting data at rest in write-ahead logs and RFiles.  [ACCUMULO-1009][ACCUMULO-1009] covers encrypting data over the wire using SSL.
+
+### Pluggable compaction strategies
+
+One of the key elements of the BigTable design is use of the Log Structured Merge Tree (LSMT) concept.  This entails sorting data in memory, writing out sorted files, and then later merging multiple sorted files into a single file.  These automatic merges happen in the background, and Accumulo decides when to merge files by comparing the relative sizes of files to a compaction ratio.  Previously, adjusting the compaction ratio was the only way a user could control this process.  [ACCUMULO-1451][ACCUMULO-1451] introduces pluggable compaction strategies, which allow users to choose when and what files to compact.  [ACCUMULO-1808][ACCUMULO-1808] adds a compaction strategy that prevents compaction of files over a configurable size.
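
A ratio-based selection of the kind described above might look like the following plain-Java sketch. These notes do not spell out the rule, so the sketch assumes the one described in the Accumulo user manual: a set of files is merged when the largest file's size times the ratio is at most the total size of the set, and otherwise the largest file is dropped and the check repeats. `CompactionRatioSketch` and `select` are hypothetical names, not Accumulo classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Simplified, hypothetical model of ratio-based file selection. */
final class CompactionRatioSketch {
    /** Returns the sizes of the files to merge, or an empty list if no set qualifies. */
    static List<Long> select(List<Long> fileSizes, double ratio) {
        List<Long> candidates = new ArrayList<>(fileSizes);
        Collections.sort(candidates); // ascending, so the largest file is last
        long total = 0;
        for (long size : candidates) {
            total += size;
        }
        while (candidates.size() > 1) {
            long largest = candidates.get(candidates.size() - 1);
            if (largest * ratio <= total) {
                return candidates; // the files are similar enough in size to merge
            }
            // The largest file dominates the set: leave it alone and retry with the rest.
            total -= largest;
            candidates.remove(candidates.size() - 1);
        }
        return Collections.emptyList();
    }
}
```

With sizes 100, 10, 10, 10 and a ratio of 3, the 100-unit file is skipped and the three 10-unit files are selected. A pluggable strategy replaces this fixed rule with user-supplied logic.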
+
+### Lexicoders
+
+Accumulo only sorts data lexicographically.  Getting something like a pair of (<string>,<integer>) to sort correctly in Accumulo is tricky.  It's tricky because you only want to compare the integers if the strings are equal.  It's possible to make this sort properly in Accumulo if the data is encoded properly, but that's the tricky part.  To make this easier, [ACCUMULO-1336][ACCUMULO-1336] added Lexicoders to the Accumulo API.  Lexicoders provide an easy way to serialize data so that it sorts properly lexicographically.  Below is a simple example.
+
+ > PairLexicoder<String, Integer> plex = new PairLexicoder<String, Integer>(new StringLexicoder(), new IntegerLexicoder());
+ > byte[] ba1 = plex.encode(new ComparablePair<String, Integer>("b",1));
+ > byte[] ba2 = plex.encode(new ComparablePair<String, Integer>("aa",1));
+ > byte[] ba3 = plex.encode(new ComparablePair<String, Integer>("a",2));
+ > byte[] ba4 = plex.encode(new ComparablePair<String, Integer>("a",1));
+ > byte[] ba5 = plex.encode(new ComparablePair<String, Integer>("aa",-3));
+ >
+ > // Sorting ba1, ba2, ba3, ba4, and ba5 lexicographically yields the same order as sorting the ComparablePairs.
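
To see why the encoding matters, here is a plain-JDK sketch of one order-preserving encoding for a (string, integer) pair. It is an illustration only and not necessarily the byte layout the Accumulo Lexicoders produce. `PairEncodingSketch`, `encode`, and `compare` are hypothetical names, and the sketch assumes ASCII strings that contain no NUL character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical order-preserving encoding of a (string, int) pair. */
final class PairEncodingSketch {
    /**
     * Encodes (s, i) so that unsigned byte-wise comparison of the results
     * matches comparing the strings first and the integers second.
     */
    static byte[] encode(String s, int i) {
        byte[] str = s.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(str.length + 1 + 4)
            .put(str)
            .put((byte) 0)          // terminator: makes "a" sort before "aa"
            .putInt(i ^ 0x80000000) // big-endian with the sign bit flipped, so negatives sort first
            .array();
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int k = 0; k < n; k++) {
            int diff = (a[k] & 0xff) - (b[k] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

Without the terminator and the sign-bit flip, a naive concatenation would mix string bytes with integer bytes and would sort negative integers after positive ones.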
+
+### Multi-table Accumulo input format
+
+[ACCUMULO-391][ACCUMULO-391] makes it possible to easily read from multiple tables in a MapReduce job.  TODO is there more to say about this? If not, maybe move to one-liners.
+
+### Locality groups in memory
+
+In cases where a very small amount of data is stored in a locality group, one would expect fast scans over that locality group.  However, this was not always the case, because recently written data stored in memory was not partitioned by locality group.  Therefore, if a table had 100GB of data in memory and 1MB of that was in locality group A, then scanning A would have required reading all 100GB.  [ACCUMULO-112][ACCUMULO-112] changes this and partitions data by locality group as it's written.
+
+### Jline2 support in shell
+
+[ACCUMULO-1442][ACCUMULO-1442] TODO what's some of the goodness this brings to the shell?
+
+### Service IP addresses
+
+Previous versions of Accumulo always used IP addresses internally.  This could be problematic in virtual machine environments where IP addresses change.  [ACCUMULO-1585][ACCUMULO-1585] changed this: Accumulo now uses the exact hostnames from its config files for internal addressing.
+
+All Accumulo processes running on a cluster are locatable via ZooKeeper.  Therefore, using well-known ports is not really required.  [ACCUMULO-1664][ACCUMULO-1664] makes it possible for all Accumulo processes to use random ports.  This makes it easier to run multiple Accumulo processes on a single node.
+
+### Other notable changes
+
+ * [ACCUMULO-842][ACCUMULO-842] Added FATE administration to shell
+ * [ACCUMULO-1481][ACCUMULO-1481] The root tablet is now the root table.
+ * [ACCUMULO-1566][ACCUMULO-1566] When read-ahead starts in the scanner is now configurable.
+ * [ACCUMULO-1667][ACCUMULO-1667] Added a synchronous version of online and offline table
+ * [ACCUMULO-2128][ACCUMULO-2128] Provide resource cleanup via static utility
+
+## Notable Bug Fixes
+
+TODO kturner looked at bugs w/ fix version of 1.6.0 and a non-empty affects version and selected ones he thought were relevant to users.... need other devs to do this
+TODO some bugs may be unintelligible to end users... either improve the issue description or remove from list
+
+ * [ACCUMULO-324][ACCUMULO-324] System/site constraints and iterators should 
NOT affect the METADATA table
+ * [ACCUMULO-335][ACCUMULO-335] Batch scanning over the !METADATA table can 
cause issues
+ * [ACCUMULO-1018][ACCUMULO-1018] Client does not give informative message 
when user can not read table
+ * [ACCUMULO-1492][ACCUMULO-1492] bin/accumulo should follow symbolic links
+ * [ACCUMULO-1572][ACCUMULO-1572] Single node zookeeper failure kills 
connected accumulo servers
+ * [ACCUMULO-1661][ACCUMULO-1661] AccumuloInputFormat cannot fetch empty 
column family
+ * [ACCUMULO-1696][ACCUMULO-1696] Deep copy in the compaction scope iterators 
can throw off the stats
+ * [ACCUMULO-1698][ACCUMULO-1698] stop-here doesn't consider system hostname
+ * [ACCUMULO-1833][ACCUMULO-1833] MultiTableBatchWriterImpl.getBatchWriter() 
is not performant for multiple threads
+ * [ACCUMULO-1901][ACCUMULO-1901] start-here.sh starts only one GC process 
even if more are defined
+ * [ACCUMULO-1921][ACCUMULO-1921] NPE in tablet assignment
+ * [ACCUMULO-1994][ACCUMULO-1994] Proxy does not handle Key timestamps 
correctly
+ * [ACCUMULO-2174][ACCUMULO-2174] VFS Classloader has potential to collide 
localized resources
+ * [ACCUMULO-2225][ACCUMULO-2225] Need to better handle DNS failure 
propagation from Hadoop
+ * [ACCUMULO-2234][ACCUMULO-2234] Cannot run offline mapreduce over 
non-default instance.dfs.dir value
+ * [ACCUMULO-2334][ACCUMULO-2334] Lacking fallback when ACCUMULO_LOG_HOST 
isn't set
+ * [ACCUMULO-2408][ACCUMULO-2408] metadata table not assigned after root table 
is loaded
+ * [ACCUMULO-2519][ACCUMULO-2519] FATE operation failed across upgrade
+
+## Known Issues
+
+When using Accumulo 1.6 and Hadoop 2, Accumulo will call hsync() on HDFS.
+Calling hsync improves durability by ensuring data is on disk (where other older
+Hadoop versions might lose data in the face of power failure); however, calling
+hsync frequently does noticeably slow writes. A simple workaround is to increase
+the value of the tserver.mutation.queue.max configuration parameter via accumulo-site.xml.
+
+A value of "4M" is a better recommendation, though memory consumption will increase with
+the number of concurrent writers to that TabletServer. For example, a value of 4M with
+50 concurrent writers would equate to approximately 200M of Java heap being used for
+mutation queues.
+
+For more information, see [ACCUMULO-1950][ACCUMULO-1950] and [this comment][ACCUMULO-1905-comment].
+
+### Other known issues
+
+ * [ACCUMULO-1507][ACCUMULO-1507] Dynamic Classloader still can't keep proper 
track of jars
+ * [ACCUMULO-1588][ACCUMULO-1588] Monitor XML and JSON differ
+ * [ACCUMULO-1628][ACCUMULO-1628] NPE on deep copied dumped memory iterator
+ * [ACCUMULO-1708][ACCUMULO-1708] [ACCUMULO-2495][ACCUMULO-2495] Out of memory 
errors do not always kill tservers leading to unexpected behavior
+ * [ACCUMULO-2008][ACCUMULO-2008] Block cache reserves section for in-memory 
blocks
+ * [ACCUMULO-2059][ACCUMULO-2059] Namespace constraints easily get clobbered 
by table constraints
+
+TODO look for other known issues
+
+## Documentation updates
+
+ * [ACCUMULO-1218][ACCUMULO-1218] document the recovery from a failed zookeeper
+ * [ACCUMULO-1375][ACCUMULO-1375] Update README files in proxy module.
+ * [ACCUMULO-1407][ACCUMULO-1407] Fix documentation for deleterows
+ * [ACCUMULO-1428][ACCUMULO-1428] Document native maps
+ * [ACCUMULO-1946][ACCUMULO-1946] Include dfs.datanode.synconclose in hdfs 
configuration documentation
+ * [ACCUMULO-1956][ACCUMULO-1956] Add section on decomissioning or adding 
nodes to an Accumulo cluster
+ * [ACCUMULO-2441][ACCUMULO-2441] Document internal state stored in RFile names
+ * [ACCUMULO-2590][ACCUMULO-2590] Update public API in readme to clarify 
what's included
+
+## Testing
+
+Below is a list of all platforms that 1.6.0 was tested against by developers. Each Apache Accumulo release
+has a set of tests that must be run before the candidate is capable of becoming an official release. That set includes the following:
+
+ 1. Successfully run all unit tests
+ 2. Successfully run all functional tests (test/system/auto)
+ 3. Successfully complete two 24-hour RandomWalk tests (LongClean module), with and without "agitation"
+ 4. Successfully complete two 24-hour Continuous Ingest tests, with and without "agitation", with data verification
+ 5. Successfully complete two 72-hour Continuous Ingest tests, with and without "agitation"
+
+Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run
+on any number of nodes. *Agitation* refers to randomly restarting Accumulo processes and Hadoop Datanode processes,
+and, in HDFS High-Availability instances, forcing NameNode failover.
+<table id="release_notes_testing">
+  <tr>
+    <th>OS</th>
+    <th>Hadoop</th>
+    <th>Nodes</th>
+    <th>ZooKeeper</th>
+    <th>HDFS High-Availability</th>
+    <th>Tests</th>
+  </tr>
+</table>
+
+[ACCUMULO-112]: https://issues.apache.org/jira/browse/ACCUMULO-112 "Partition 
data in memory by locality group"
+[ACCUMULO-118]: https://issues.apache.org/jira/browse/ACCUMULO-118 "Multiple 
namenode support"
+[ACCUMULO-324]: https://issues.apache.org/jira/browse/ACCUMULO-324 
"System/site constraints and iterators should NOT affect the METADATA table"
+[ACCUMULO-335]: https://issues.apache.org/jira/browse/ACCUMULO-335 "Batch 
scanning over the !METADATA table can cause issues"
+[ACCUMULO-391]: https://issues.apache.org/jira/browse/ACCUMULO-391 
"Multi-table input format"
+[ACCUMULO-802]: https://issues.apache.org/jira/browse/ACCUMULO-802 "Table 
namespaces"
+[ACCUMULO-842]: https://issues.apache.org/jira/browse/ACCUMULO-842 "Add FATE 
administration to shell"
+[ACCUMULO-958]: https://issues.apache.org/jira/browse/ACCUMULO-958 "Support 
pluggable encryption in walogs"
+[ACCUMULO-998]: https://issues.apache.org/jira/browse/ACCUMULO-998 "Support 
encryption at rest"
+[ACCUMULO-980]: https://issues.apache.org/jira/browse/ACCUMULO-980 "Support 
pluggable codecs for RFile"
+[ACCUMULO-1000]: https://issues.apache.org/jira/browse/ACCUMULO-1000 
"Conditional Mutations"
+[ACCUMULO-1009]: https://issues.apache.org/jira/browse/ACCUMULO-1009 "Support 
encryption over the wire"
+[ACCUMULO-1018]: https://issues.apache.org/jira/browse/ACCUMULO-1018 "Client 
does not give informative message when user can not read table"
+[ACCUMULO-1218]: https://issues.apache.org/jira/browse/ACCUMULO-1218 "document 
the recovery from a failed zookeeper"
+[ACCUMULO-1336]: https://issues.apache.org/jira/browse/ACCUMULO-1336 "Add 
lexicoders from Typo to Accumulo"
+[ACCUMULO-1375]: https://issues.apache.org/jira/browse/ACCUMULO-1375 "Update 
README files in proxy module."
+[ACCUMULO-1407]: https://issues.apache.org/jira/browse/ACCUMULO-1407 "Fix 
documentation for deleterows"
+[ACCUMULO-1428]: https://issues.apache.org/jira/browse/ACCUMULO-1428 "Document 
native maps"
+[ACCUMULO-1442]: https://issues.apache.org/jira/browse/ACCUMULO-1442 "Replace 
JLine with JLine2"
+[ACCUMULO-1451]: https://issues.apache.org/jira/browse/ACCUMULO-1451 "Make 
Compaction triggers extensible"
+[ACCUMULO-1481]: https://issues.apache.org/jira/browse/ACCUMULO-1481 "Root 
tablet in its own table"
+[ACCUMULO-1492]: https://issues.apache.org/jira/browse/ACCUMULO-1492 
"bin/accumulo should follow symbolic links"
+[ACCUMULO-1507]: https://issues.apache.org/jira/browse/ACCUMULO-1507 "Dynamic 
Classloader still can't keep proper track of jars"
+[ACCUMULO-1562]: https://issues.apache.org/jira/browse/ACCUMULO-1562 "add a 
troubleshooting section to the user guide"
+[ACCUMULO-1566]: https://issues.apache.org/jira/browse/ACCUMULO-1566 "Add 
ability for client to start Scanner readahead immediately"
+[ACCUMULO-1572]: https://issues.apache.org/jira/browse/ACCUMULO-1572 "Single 
node zookeeper failure kills connected accumulo servers"
+[ACCUMULO-1585]: https://issues.apache.org/jira/browse/ACCUMULO-1585 "Use 
FQDN/verbatim data from config files"
+[ACCUMULO-1588]: https://issues.apache.org/jira/browse/ACCUMULO-1588 "Monitor 
XML and JSON differ"
+[ACCUMULO-1628]: https://issues.apache.org/jira/browse/ACCUMULO-1628 "NPE on 
deep copied dumped memory iterator"
+[ACCUMULO-1661]: https://issues.apache.org/jira/browse/ACCUMULO-1661 
"AccumuloInputFormat cannot fetch empty column family"
+[ACCUMULO-1664]: https://issues.apache.org/jira/browse/ACCUMULO-1664 "Make all 
processes able to use random ports"
+[ACCUMULO-1667]: https://issues.apache.org/jira/browse/ACCUMULO-1667 "Allow 
On/Offline Command To Execute Synchronously"
+[ACCUMULO-1696]: https://issues.apache.org/jira/browse/ACCUMULO-1696 "Deep 
copy in the compaction scope iterators can throw off the stats"
+[ACCUMULO-1698]: https://issues.apache.org/jira/browse/ACCUMULO-1698 
"stop-here doesn't consider system hostname"
+[ACCUMULO-1704]: https://issues.apache.org/jira/browse/ACCUMULO-1704 
"IteratorSetting missing (int,String,Class,Map) constructor"
+[ACCUMULO-1708]: https://issues.apache.org/jira/browse/ACCUMULO-1708 "Error 
during minor compaction left tserver in bad state"
+[ACCUMULO-1808]: https://issues.apache.org/jira/browse/ACCUMULO-1808 "Create 
compaction strategy that has size limit"
+[ACCUMULO-1833]: https://issues.apache.org/jira/browse/ACCUMULO-1833 
"MultiTableBatchWriterImpl.getBatchWriter() is not performant for multiple 
threads"
+[ACCUMULO-1901]: https://issues.apache.org/jira/browse/ACCUMULO-1901 
"start-here.sh starts only one GC process even if more are defined"
+[ACCUMULO-1905-comment]: 
https://issues.apache.org/jira/browse/ACCUMULO-1905?focusedCommentId=13915208&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13915208
+[ACCUMULO-1921]: https://issues.apache.org/jira/browse/ACCUMULO-1921 "NPE in 
tablet assignment"
+[ACCUMULO-1946]: https://issues.apache.org/jira/browse/ACCUMULO-1946 "Include 
dfs.datanode.synconclose in hdfs configuration documentation"
+[ACCUMULO-1950]: https://issues.apache.org/jira/browse/ACCUMULO-1950 "Reduce 
the number of calls to hsync"
+[ACCUMULO-1956]: https://issues.apache.org/jira/browse/ACCUMULO-1956 "Add 
section on decomissioning or adding nodes to an Accumulo cluster"
+[ACCUMULO-1958]: https://issues.apache.org/jira/browse/ACCUMULO-1958 "Range 
constructor lacks key checks, should be non-public"
+[ACCUMULO-1994]: https://issues.apache.org/jira/browse/ACCUMULO-1994 "Proxy 
does not handle Key timestamps correctly"
+[ACCUMULO-2008]: https://issues.apache.org/jira/browse/ACCUMULO-2008 "Block 
cache reserves section for in-memory blocks"
+[ACCUMULO-2059]: https://issues.apache.org/jira/browse/ACCUMULO-2059 
"Namespace constraints easily get clobbered by table constraints"
+[ACCUMULO-2128]: https://issues.apache.org/jira/browse/ACCUMULO-2128 "Provide 
resource cleanup via static utility rather than Instance.close"
+[ACCUMULO-2174]: https://issues.apache.org/jira/browse/ACCUMULO-2174 "VFS 
Classloader has potential to collide localized resources"
+[ACCUMULO-2225]: https://issues.apache.org/jira/browse/ACCUMULO-2225 "Need to 
better handle DNS failure propagation from Hadoop"
+[ACCUMULO-2234]: https://issues.apache.org/jira/browse/ACCUMULO-2234 "Cannot 
run offline mapreduce over non-default instance.dfs.dir value"
+[ACCUMULO-2334]: https://issues.apache.org/jira/browse/ACCUMULO-2334 "Lacking 
fallback when ACCUMULO_LOG_HOST isn't set"
+[ACCUMULO-2408]: https://issues.apache.org/jira/browse/ACCUMULO-2408 "metadata 
table not assigned after root table is loaded"
+[ACCUMULO-2441]: https://issues.apache.org/jira/browse/ACCUMULO-2441 "Document 
internal state stored in RFile names"
+[ACCUMULO-2495]: https://issues.apache.org/jira/browse/ACCUMULO-2495 "OOM 
exception didn't bring down tserver"
+[ACCUMULO-2519]: https://issues.apache.org/jira/browse/ACCUMULO-2519 "FATE 
operation failed across upgrade"
+[ACCUMULO-2590]: https://issues.apache.org/jira/browse/ACCUMULO-2590 "Update 
public API in readme to clarify what's included"

Propchange: accumulo/site/trunk/content/release_notes/1.6.0.mdtext
------------------------------------------------------------------------------
    svn:eol-style = native

