Repository: accumulo Updated Branches: refs/heads/gh-pages b649c3fd1 -> bca63397c
Draft Release notes for 1.8.0 Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/bca63397 Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/bca63397 Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/bca63397 Branch: refs/heads/gh-pages Commit: bca63397c337921f681cd993d5f970b1bb2654e4 Parents: b649c3f Author: Michael Wall <mjw...@gmail.com> Authored: Thu Sep 1 14:21:46 2016 -0400 Committer: Michael Wall <mjw...@gmail.com> Committed: Thu Sep 1 14:21:46 2016 -0400 ---------------------------------------------------------------------- release_notes/1.8.0.md | 146 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 146 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/accumulo/blob/bca63397/release_notes/1.8.0.md ---------------------------------------------------------------------- diff --git a/release_notes/1.8.0.md b/release_notes/1.8.0.md new file mode 100644 index 0000000..e52d282 --- /dev/null +++ b/release_notes/1.8.0.md @@ -0,0 +1,146 @@ +--- +title: [DRAFT] Apache Accumulo 1.8.0 Release Notes [DRAFT] +nav: nav_rn_180 +--- + +*THIS DOCUMENT IS A DRAFT* + +Apache Accumulo 1.8.0 is a significant release that includes many important +milestone features which expand the functionality of Accumulo. These include +features related to security, availability, and extensibility. Over +340 JIRA issues were resolved in this version. This includes nearly +200 bug fixes and 71 improvements and 4 new features. See +[JIRA][JIRA_180] for the complete list. + +[JIRA_180]: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12312121&version=12329879 + +In the context of Accumulo's [Semantic Versioning][semver] [guidelines][api], +this is a "minor version". This means that new APIs have been created, some +deprecations may have been added, but no deprecated APIs have been removed. +Code written against 1.7.x should work against 1.8.0, likely binary-compatible +but definitely source-compatible. As always, the Accumulo developers take API compatibility +very seriously and have invested much time to ensure that we meet the promises set forth to our users. + +[api]: https://github.com/apache/accumulo/blob/1.8/README.md#api +[semver]: http://semver.org + +## Major Changes + +### Speed up WAL roll overs + +By creating an active and standby writeahead log entry, the roll becomes just swapping writers. This was a substantial refactor +in they way WALs worked, but seems to provide a increase in write speed as shown by the simple test below. The top entry is before +[ACCUMULO-3423][ACCUMULO-3423] and the bottom graph is after the refactor. + +![Graph of WAL speed up after ACCUMULO-3423][IMG-3423] + +[IMG-3423]: https://issues.apache.org/jira/secure/attachment/12705402/WAL-slowdown-graphs.jpg "Graph of WAL speed up after ACCUMULO-3423" +[ACCUMULO-3423]: https://issues.apache.org/jira/browse/ACCUMULO-3423 + +### User level API for RFile + +Previously the only public API available to write RFiles was via the AccumuloFileOutputFormat. There was no way to read RFiles in the public +API. [ACCUMULO-4165][ACCUMULO-4165] exposes new public API for reading and writing RFiles as well as cleans up some of the internal APIs. + +[ACCUMULO-4165]: https://issues.apache.org/jira/browse/ACCUMULO-4165 + +### Suspend Tablet assignment for rolling restarts + +When a tablet server dies, Accumulo attempted to reassign the tablets as quickly as possible to maintain availability. +A new configuration property `table.suspend.duration` with a default of 0s now controls how long to wait before reassigning +a tablet from a dead tserver. The property is configurable in Zookeeper, so you can set it, do a rolling restart, and then +set it back to 0. A new state as introduced, TableState.SUSPENDED to support this feature. By default, metadata tablet +reassignment is not suspended, but that can also be changed with the `master.metadata.suspendable` property that is false by +default. Root tablet assignment can not be suspended. See [ACCUMULO-4353] for more info. + +[ACCUMULO-4353]: https://issues.apache.org/jira/browse/ACCUMULO-4353 + +### Run multiple Tablet Servers on one node + +[ACCUMULO-4328] introduces the capability of running multiple tservers on a single node. This intended for nodes with a large +amount of memory. This feature is disabled by default. There are several related tickets: [ACCUMULO-4072], [ACCUMULO-4331] +and [ACCUMULO-4406]. Note, this changes the names of the log files. Previous log file names were defined in the +generic_logger.xml as `${org.apache.accumulo.core.application}_${instance}_${org.apache.accumulo.core.ip.localhost.hostname}.log`. +The files will now include the instance id after the application with +`${org.apache.accumulo.core.application}_${instance}_${org.apache.accumulo.core.ip.localhost.hostname}.log`. For example: +master_1_localhost.log instead of master_localhost.log. The same change was made to the debug logs as well. + +[ACCUMULO-4328]: https://issues.apache.org/jira/browse/ACCUMULO-4328 +[ACCUMULO-4072]: https://issues.apache.org/jira/browse/ACCUMULO-4072 +[ACCUMULO-4331]: https://issues.apache.org/jira/browse/ACCUMULO-4331 +[ACCUMULO-4406]: https://issues.apache.org/jira/browse/ACCUMULO-4406 + +### Rate limiting Major Compactions + +Major Compactions can overwhelm a tablet server, rendering it nearly unresponsive. [ACCUMULO-4187] take a cue from Apache +Cassandra and restrict how quickly we perform major compactions by rate limiting reads and writes. This has a direct affect +on the IO load caused by major compactions, and should also indirectly affect the CPU load. This behavior is controlled +by a new property `tserver.compaction.major.throughput` with a defaults of 0B which disables the rate limiting. + +[ACCUMULO-4187]: https://issues.apache.org/jira/browse/ACCUMULO-4187 + +### Upgrade to Apache Thrift 0.9.3 + +Accumulo relies on Apache Thrift to implement remote procedure calls between Accumulo services. +Ticket [ACCUMULO-4077][ACCUMULO-4077] updates our dependency to 0.9.3. See the [Apache Thrift 0.9.3 Release Notes][THRIFT-0.9.3-RN] for details +on the changes to Thrift. + +[ACCUMULO-4077]: https://issues.apache.org/jira/browse/ACCUMULO-4077 +[THRIFT-0.9.3-RN]: https://github.com/apache/thrift/blob/0.9.3/CHANGES + +### Iterator fuzz testing + +Users often write iterators without fully understanding its limits and lifetime. Accumulo should have an +iterator fuzz-tester which will take user data and run the iterator under extreme conditions. For example, +it should re-create and re-seek the iterator with every key returned. It could automatically compare results +of such a run with the naive run, which seeks to the beginning and scans all the data. See +[ACCUMULO-626][ACCUMULO-626] for more details. + +[ACCUMULO-626]: https://issues.apache.org/jira/browse/ACCUMULO-626 + +### Default port for Monitor changed to 9995 + +Previously, the default port for the monitor was 50095. You will need to update your links to point to port 9995. The default port for the GC process was also changed from 50091 to 9998, although this mostly used internally. Ticket [ACCUMULO-3409] +documents why the defaults where changed to something outside the range of ephemeral ports. These values are still configurable by setting `monitor.port.client` +and `gc.port.client` in the accumulo-site.xml + +[ACCUMULO-3409]: https://issues.apache.org/jira/browse/ACCUMULO-3409 + + +## Other Notable Changes + + * [ACCUMULO-1055][ACCUMULO-1055] Configurable maximum file size for merging minor compactions + * [ACCUMULO-1124][ACCUMULO-1124] Optimization of RFile index + * [ACCUMULO-2883][ACCUMULO-2883] API to fetch current tablet assignments + * [ACCUMULO-3871][ACCUMULO-3871] Support for running integration tests in MapReduce + * [ACCUMULO-3920][ACCUMULO-3920] Deprecate the MockAccumulo class and remove usage in our tests + * [ACCUMULO-4339][ACCUMULO-4339] Make hadoop-minicluster optional dependency of acccumulo-minicluster + * [ACCUMULO-4354][ACCUMULO-4354] Bump dependency versions to include gson, jetty, and sl4j + * [ACCUMULO-3735][ACCUMULO-3735] Bulk Import status page on the monitor + + [ACCUMULO-3735]: https://issues.apache.org/jira/browse/ACCUMULO-3735 + [ACCUMULO-4354]: https://issues.apache.org/jira/browse/ACCUMULO-4354 + [ACCUMULO-4339]: https://issues.apache.org/jira/browse/ACCUMULO-4339 + [ACCUMULO-3920]: https://issues.apache.org/jira/browse/ACCUMULO-3920 + [ACCUMULO-3871]: https://issues.apache.org/jira/browse/ACCUMULO-3871 + [ACCUMULO-2883]: https://issues.apache.org/jira/browse/ACCUMULO-2883 + [ACCUMULO-1055]: https://issues.apache.org/jira/browse/ACCUMULO-1055 + [ACCUMULO-1124]: https://issues.apache.org/jira/browse/ACCUMULO-1124 + +## Testing + +Each unit and functional test only runs on a single node, while the RandomWalk +and Continuous Ingest tests run on any number of nodes. *Agitation* refers to +randomly restarting Accumulo processes and Hadoop Datanode processes, and, in +HDFS High-Availability instances, forcing NameNode failover. + +{: #release_notes_testing .table } +| OS/Environment | Hadoop | Nodes | ZooKeeper | HDFS HA | Tests | +|----------------------------------------------------------------------------|----------------------|-------|------------------|---------|----------------------------------------------| +| CentOS7/openJDK7/EC2; 3 m3.xlarge leaders, 8 d2.xlarge workers | 2.6.4 | 11 | 3.4.8 | No | 24 HR Continuous Ingest without Agitation. | +| CentOS7/openJDK7/EC2; 3 m3.xlarge leaders, 8 d2.xlarge workers | 2.6.4 | 11 | 3.4.8 | No | 16 HR Continuous Ingest with Agitation. | +| CentOS7/openJDK7/OpenStack VMs (16G RAM 2cores 2disk3; 1 leader, 5 workers | HDP 2.5 (Hadoop 2.7) | 7 | HDP 2.5 (ZK 3.4) | No | 24 HR Continuous Ingest without Agitation. | +| CentOS7/openJDK7/OpenStack VMs (16G RAM 2cores 2disk3; 1 leader, 5 workers | HDP 2.5 (Hadoop 2.7) | 7 | HDP 2.5 (ZK 3.4) | No | 24 HR Continuous Ingest with Agitation. | + + +*THIS DOCUMENT IS A DRAFT*