This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git

The following commit(s) were added to refs/heads/asf-staging by this push:
     new 487f2ee7 Automatic Site Publish by Buildbot
487f2ee7 is described below

commit 487f2ee777812e91a95a71ca0515759cc67830f8
Author: buildbot <us...@infra.apache.org>
AuthorDate: Wed Nov 2 21:26:37 2022 +0000

    Automatic Site Publish by Buildbot
---
 output/docs/2.x/administration/caching.html | 2 +-
 output/docs/2.x/administration/compaction.html | 12 ++---
 output/docs/2.x/administration/fate.html | 6 +--
 .../docs/2.x/administration/in-depth-install.html | 12 ++---
 output/docs/2.x/administration/multivolume.html | 2 +-
 output/docs/2.x/administration/replication.html | 4 +-
 output/docs/2.x/administration/scan-executors.html | 8 ++--
 output/docs/2.x/administration/upgrading.html | 10 ++--
 output/docs/2.x/configuration/files.html | 6 +--
 output/docs/2.x/configuration/overview.html | 4 +-
 .../docs/2.x/configuration/server-properties.html | 2 +-
 output/docs/2.x/development/iterators.html | 6 +--
 output/docs/2.x/development/sampling.html | 2 +-
 output/docs/2.x/development/summaries.html | 6 +--
 output/docs/2.x/getting-started/clients.html | 14 +++---
 output/docs/2.x/getting-started/design.html | 4 +-
 output/docs/2.x/getting-started/features.html | 8 ++--
 output/docs/2.x/getting-started/glossary.html | 2 +-
 output/docs/2.x/getting-started/quickstart.html | 6 +--
 .../2.x/getting-started/table_configuration.html | 14 +++---
 output/docs/2.x/security/authorizations.html | 4 +-
 output/docs/2.x/security/kerberos.html | 10 ++--
 output/docs/2.x/security/on-disk-encryption.html | 2 +-
 output/docs/2.x/security/wire-encryption.html | 2 +-
 output/docs/2.x/troubleshooting/advanced.html | 10 ++--
 output/docs/2.x/troubleshooting/basic.html | 6 +--
 .../troubleshooting/system-metadata-tables.html | 2 +-
 output/docs/2.x/troubleshooting/tools.html | 4 +-
 output/feed.xml | 4 +-
 output/search_data.json | 56 +++++++++++-----------
 30 files changed, 115 insertions(+), 115 deletions(-)

diff --git 
a/output/docs/2.x/administration/caching.html b/output/docs/2.x/administration/caching.html index e4d0f769..8a1738ab 100644 --- a/output/docs/2.x/administration/caching.html +++ b/output/docs/2.x/administration/caching.html @@ -447,7 +447,7 @@ for tables where read performance is critical.</p> <h2 id="configuration">Configuration</h2> <p>The <a href="/docs/2.x/configuration/server-properties#tserver_cache_manager_class">tserver.cache.manager.class</a> property controls which block cache implementation is used within the tablet server. Users -can supply their own implementation and set custom configuration properties to control it’s behavior (see org.apache.accumulo.core.spi.cache.BlockCacheManager$Configuration.java).</p> +can supply their own implementation and set custom configuration properties to control its behavior (see org.apache.accumulo.core.spi.cache.BlockCacheManager$Configuration.java).</p> <p>The index and data block caches are configured for tables by the following properties:</p> diff --git a/output/docs/2.x/administration/compaction.html b/output/docs/2.x/administration/compaction.html index ca4ab9ab..3554020f 100644 --- a/output/docs/2.x/administration/compaction.html +++ b/output/docs/2.x/administration/compaction.html @@ -432,7 +432,7 @@ <p>In Accumulo each tablet has a list of files associated with it. As data is written to Accumulo it is buffered in memory. The data buffered in memory is -eventually written to files in DFS on a per tablet basis. Files can also be +eventually written to files in DFS on a per-tablet basis. Files can also be added to tablets directly by bulk import. In the background tablet servers run major compactions to merge multiple files into one. The tablet server has to decide which tablets to compact and which files within a tablet to compact.</p> @@ -457,7 +457,7 @@ a set. The default planner looks for file sets where LFS*CR <= FSS. 
By only compacting sets of files that meet this requirement the amount of work done by compactions is O(N * log<sub>CR</sub>(N)). Increasing the ratio will result in less compaction work and more files per tablet. More files per -tablet means more higher query latency. So adjusting this ratio is a trade off +tablet means higher query latency. So adjusting this ratio is a trade-off between ingest and query performance.</p> <p>When CR=1.0 this will result in a goal of a single per file tablet, but the @@ -474,7 +474,7 @@ of this documentation only applies to Accumulo 2.1 and later.</p> <ul> <li>Create a compaction service named <code class="language-plaintext highlighter-rouge">cs1</code> that has three executors. The first executor named <code class="language-plaintext highlighter-rouge">small</code> has 8 threads and runs compactions less than 16M. The second executor <code class="language-plaintext highlighter-rouge">medium</code> runs compactions less than 128M with 4 threads. The last executor <code class="language-plaintext highlighter-rouge">large</code> runs al [...] - <li>Create a compaction service named <code class="language-plaintext highlighter-rouge">cs2</code> that has three executors. It has similar config to <code class="language-plaintext highlighter-rouge">cs1</code>, but its executors have less threads. Limits total I/O of all compactions within the service to 40MB/s.</li> + <li>Create a compaction service named <code class="language-plaintext highlighter-rouge">cs2</code> that has three executors. It has similar config to <code class="language-plaintext highlighter-rouge">cs1</code>, but its executors have fewer threads. 
Limits total I/O of all compactions within the service to 40MB/s.</li> <li>Configure table <code class="language-plaintext highlighter-rouge">ci</code> to use compaction service <code class="language-plaintext highlighter-rouge">cs1</code> for system compactions and service <code class="language-plaintext highlighter-rouge">cs2</code> for user compactions.</li> </ul> @@ -526,7 +526,7 @@ in an Accumulo deployment:</p> <h3 id="configuration-1">Configuration</h3> <p>Configuration for external compactions is very similar to the internal compaction example above. -In the example below we create a Compaction Service <code class="language-plaintext highlighter-rouge">cs1</code> and configure it with an queue +In the example below we create a Compaction Service <code class="language-plaintext highlighter-rouge">cs1</code> and configure it with a queue named <code class="language-plaintext highlighter-rouge">DCQ1</code>. We then define the Compaction Dispatcher on table <code class="language-plaintext highlighter-rouge">testTable</code> and configure the table to use the <code class="language-plaintext highlighter-rouge">cs1</code> Compaction Service for planning and executing all compactions.</p> @@ -547,7 +547,7 @@ config -t testTable -s table.compaction.dispatcher.opts.service=cs1 <p>When a Compactor is free to perform work, it asks the CompactionCoordinator for the next compaction job. The CompactionCoordinator contacts the next TabletServer that has the highest priority for the Compactor’s queue. The TabletServer returns the information necessary for the compaction to occur to the CompactionCoordinator, which is passed on to the Compactor. The Compaction Coordinator maintains an in-memory list of running compactions and also inserts an entry into the metadata ta [...] -<p>External compactions handle faults and major system events in Accumulo. When a compactor process dies this will be detected and any files it had reserved in a tablet will be unreserved. 
When a tserver dies, this will not impact any external compactions running on behalf of tablets that tserver was hosting. The case of tablets not being hosted on an tserver when an external compaction tries to commit is also handled. Tablets being deleted (by split, merge, or table deletion) will ca [...] +<p>External compactions handle faults and major system events in Accumulo. When a compactor process dies this will be detected and any files it had reserved in a tablet will be unreserved. When a tserver dies, this will not impact any external compactions running on behalf of tablets that tserver was hosting. The case of tablets not being hosted on a tserver when an external compaction tries to commit is also handled. Tablets being deleted (by split, merge, or table deletion) will cau [...] <h3 id="external-compaction-in-action">External Compaction in Action</h3> @@ -661,7 +661,7 @@ false, kind:USER) <p>The names of compaction services and executors are used in logging. The log messages below are from a tserver with the configuration above with data being -written to the ci table. Also a compaction of the table was forced from the +written to the ci table. Also, a compaction of the table was forced from the shell.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>2020-06-25T16:34:31,669 [tablet.files] DEBUG: Compacting 3;667;6 on cs1.small for SYSTEM from [C00001cm.rf, C00001a7.rf, F00001db.rf] size 15 MB diff --git a/output/docs/2.x/administration/fate.html b/output/docs/2.x/administration/fate.html index 252f9e16..307c8ae4 100644 --- a/output/docs/2.x/administration/fate.html +++ b/output/docs/2.x/administration/fate.html @@ -458,7 +458,7 @@ of these operations. This property is also what guarantees safety in light of fa <h3 id="repo-stack">REPO Stack</h3> <p>A FATE transaction is composed of a sequence of Repeatable persisted operations (REPO). 
In order to start a FATE transaction, -a REPO is pushed onto a per transaction REPO stack. The top of the stack always contains the +a REPO is pushed onto a per-transaction REPO stack. The top of the stack always contains the next REPO the FATE transaction should execute. When a REPO is successful it may return another REPO which is pushed on the stack.</p> @@ -572,11 +572,11 @@ invoke. It is not normal to invoke this command.</p> <p>This command accepts zero more transaction IDs. If given no transaction IDs, it will dump all active transactions. A FATE operations is compromised as a sequence of REPOs. In order to start a FATE transaction, a REPO is pushed onto -a per transaction REPO stack. The top of the stack always contains the next +a per-transaction REPO stack. The top of the stack always contains the next REPO the FATE transaction should execute. When a REPO is successful it may return another REPO which is pushed on the stack. The <code class="language-plaintext highlighter-rouge">dump</code> command will print all of the REPOs on each transactions stack. The REPOs are serialized to -JSON in order to make them human readable.</p> +JSON in order to make them human-readable.</p> <div class="row" style="margin-top: 20px;"> diff --git a/output/docs/2.x/administration/in-depth-install.html b/output/docs/2.x/administration/in-depth-install.html index f6409616..6e8e8b93 100644 --- a/output/docs/2.x/administration/in-depth-install.html +++ b/output/docs/2.x/administration/in-depth-install.html @@ -591,7 +591,7 @@ manually or run <code class="language-plaintext highlighter-rouge">accumulo-clus <p>Logging is configured in <a href="/docs/2.x/configuration/files#accumulo-envsh">accumulo-env.sh</a> to use three log4j configuration files in <code class="language-plaintext highlighter-rouge">conf/</code>. The file used depends on the Accumulo command or service being run. 
Logging for most Accumulo services -(i.e Manager, TabletServer, Garbage Collector) is configured by <a href="/docs/2.x/configuration/files#log4j-serviceproperties">log4j-service.properties</a> except for +(i.e. Manager, TabletServer, Garbage Collector) is configured by <a href="/docs/2.x/configuration/files#log4j-serviceproperties">log4j-service.properties</a> except for the Monitor which is configured by <a href="/docs/2.x/configuration/files#log4j-monitorproperties">log4j-monitor.properties</a>. All Accumulo commands (i.e <code class="language-plaintext highlighter-rouge">init</code>, <code class="language-plaintext highlighter-rouge">shell</code>, etc) are configured by <a href="/docs/2.x/configuration/files#log4jproperties">log4j.properties</a>.</p> @@ -745,7 +745,7 @@ instance, Accumulo identifies <code class="language-plaintext highlighter-rouge" <h3 id="deploy-configuration">Deploy Configuration</h3> <p>Copy <a href="/docs/2.x/configuration/files#accumulo-envsh">accumulo-env.sh</a> and <a href="/docs/2.x/configuration/files#accumuloproperties">accumulo.properties</a> from the <code class="language-plaintext highlighter-rouge">conf/</code> directory on the manager to all -Accumulo tablet servers. The “host” configuration files files <code class="language-plaintext highlighter-rouge">accumulo-cluster</code> only need to be on +Accumulo tablet servers. 
The “host” configuration files <code class="language-plaintext highlighter-rouge">accumulo-cluster</code> only need to be on servers where that command is run.</p> <h3 id="sensitive-configuration-values">Sensitive Configuration Values</h3> @@ -841,7 +841,7 @@ Below is an example for specify the app1 context in the <a href="/docs/2.x/confi general.vfs.context.classpath.app1=hdfs://localhost:8020/applicationA/classpath/.*.jar,file:///opt/applicationA/lib/.*.jar </code></pre></div></div> -<p>The default behavior follows the Java ClassLoader contract in that classes, if they exists, are +<p>The default behavior follows the Java ClassLoader contract in that classes, if they exist, are loaded from the parent classloader first. You can override this behavior by delegating to the parent classloader after looking in this classloader first. An example of this configuration is:</p> @@ -925,7 +925,7 @@ stopped and recovery will not need to be performed when the tablets are re-hoste <p>Occasionally, it might be necessary to restart the processes on a specific node. In addition to the <code class="language-plaintext highlighter-rouge">accumulo-cluster</code> script, Accumulo has a <code class="language-plaintext highlighter-rouge">accumulo-service</code> script that -can be use to start/stop processes on a node.</p> +can be used to start/stop processes on a node.</p> <h4 id="a-note-on-rolling-restarts">A note on rolling restarts</h4> @@ -1213,11 +1213,11 @@ can be exacerbated by resource constraints and clock drift.</p> <p>Each release of Accumulo is built with a specific version of Apache Hadoop, Apache ZooKeeper and Apache Thrift. We expect Accumulo to work with versions that are API compatible with those versions. -However this compatibility is not guaranteed because Hadoop, ZooKeeper +However, this compatibility is not guaranteed because Hadoop, ZooKeeper and Thrift may not provide guarantees between their own versions. 
We have also found that certain versions of Accumulo and Hadoop included bugs that greatly affected overall stability. Thrift is particularly -prone to compatibility changes between versions and you must use the +prone to compatibility changes between versions, and you must use the same version your Accumulo is built with.</p> <p>Please check the release notes for your Accumulo version or use the diff --git a/output/docs/2.x/administration/multivolume.html b/output/docs/2.x/administration/multivolume.html index 35c66916..807ffd21 100644 --- a/output/docs/2.x/administration/multivolume.html +++ b/output/docs/2.x/administration/multivolume.html @@ -480,7 +480,7 @@ need to be restarted.</p> managing the fully qualified URIs stored in Accumulo. Viewfs and HA namenode both introduce a level of indirection in the Hadoop configuration. For example assume viewfs:///nn1 maps to hdfs://nn1 in the Hadoop configuration. -If viewfs://nn1 is used by Accumulo, then its easy to map viewfs://nn1 to +If viewfs://nn1 is used by Accumulo, then it’s easy to map viewfs://nn1 to hdfs://nnA by changing the Hadoop configuration w/o doing anything to Accumulo. A production system should probably use a HA namenode. Viewfs may be useful on a test system with a single non HA namenode.</p> diff --git a/output/docs/2.x/administration/replication.html b/output/docs/2.x/administration/replication.html index 43716d3c..28405260 100644 --- a/output/docs/2.x/administration/replication.html +++ b/output/docs/2.x/administration/replication.html @@ -452,7 +452,7 @@ space available on the primary system.</p> <p>Replication configurations can be considered as a directed graph which allows cycles. 
The systems in which data was replicated from is maintained in each Mutation which -allow each system to determine if a peer has already has the data in which +allow each system to determine if a peer already has the data in which the system wants to send.</p> <p>Data is replicated by using the Write-Ahead logs (WAL) that each TabletServer is @@ -659,7 +659,7 @@ root@peer> grant -t my_table -u peer Table.READ root@peer> tables -l </code></pre></div></div> -<p>Remember what the table ID for ‘my_table’ is. You’ll need that to configured the primary instance.</p> +<p>Remember what the table ID for ‘my_table’ is. You’ll need that to configure the primary instance.</p> <h3 id="primary-1">Primary</h3> diff --git a/output/docs/2.x/administration/scan-executors.html b/output/docs/2.x/administration/scan-executors.html index 17c3ee69..ee45e444 100644 --- a/output/docs/2.x/administration/scan-executors.html +++ b/output/docs/2.x/administration/scan-executors.html @@ -480,7 +480,7 @@ SimpleScanDispatcher supports an <code class="language-plaintext highlighter-rou executor. If this option is not set, then SimpleScanDispatcher will dispatch to the scan executor named <code class="language-plaintext highlighter-rouge">default</code>.</p> -<p>To to tie everything together, consider the following use case.</p> +<p>To tie everything together, consider the following use case.</p> <ul> <li>Create tables named LOW1 and LOW2 using a scan executor with a single thread.</li> @@ -511,7 +511,7 @@ config -t HIGH -s table.scan.dispatcher=org.apache.accumulo.core.spi.scan.Simple config -t HIGH -s table.scan.dispatcher.opts.executor=high </code></pre></div></div> -<p>While not necessary because its the default, it is safer to also set +<p>While not necessary because it’s the default, it is safer to also set <code class="language-plaintext highlighter-rouge">table.scan.dispatcher=org.apache.accumulo.core.spi.scan.SimpleScanDispatcher</code> for each table. 
This ensures things work as expected in the case where <code class="language-plaintext highlighter-rouge">table.scan.dispatcher</code> was set at the system or namespace level.</p> @@ -519,7 +519,7 @@ for each table. This ensures things work as expected in the case where <h3 id="configuring-and-using-scan-prioritizers">Configuring and using Scan Prioritizers.</h3> <p>When all scan executor threads are busy, incoming work is queued. By -default this queue has a FIFO order. A <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/spi/scan/ScanPrioritizer.html">ScanPrioritizer</a> can be configured to +default, this queue has a FIFO order. A <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/spi/scan/ScanPrioritizer.html">ScanPrioritizer</a> can be configured to reorder the queue. Accumulo ships with the <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/spi/scan/IdleRatioScanPrioritizer.html">IdleRatioScanPrioritizer</a> which orders the queue by the ratio of run time to idle time. For example, a scan with a run time of 50ms and an idle time of 200ms would have a ratio of .25. @@ -579,7 +579,7 @@ priority of 1.</p> <p>Execution Hints can also be used to influence how the block caches are used for a scan. 
The following configuration would modify the <code class="language-plaintext highlighter-rouge">gamma</code> executor to use blocks -in the cache if they are already cached, but would never load mising blocks into the +in the cache if they are already cached, but would never load missing blocks into the cache.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>config -t tex -s table.scan.dispatcher.opts.cacheUsage.gamma=opportunistic diff --git a/output/docs/2.x/administration/upgrading.html b/output/docs/2.x/administration/upgrading.html index 1249678c..e803e7e2 100644 --- a/output/docs/2.x/administration/upgrading.html +++ b/output/docs/2.x/administration/upgrading.html @@ -505,7 +505,7 @@ in the new file are different so take care when customizing.</li> </li> <li><code class="language-plaintext highlighter-rouge">accumulo-env.sh</code> constructs environment variables (such as <code class="language-plaintext highlighter-rouge">JAVA_OPTS</code> and <code class="language-plaintext highlighter-rouge">CLASSPATH</code>) used when running Accumulo processes <ul> - <li>This file was used in Accumulo 1.x but has changed signficantly for 2.0</li> + <li>This file was used in Accumulo 1.x but has changed significantly for 2.0</li> <li>Environment variables (such as <code class="language-plaintext highlighter-rouge">$cmd</code>, <code class="language-plaintext highlighter-rouge">$bin</code>, <code class="language-plaintext highlighter-rouge">$conf</code>) are set before <code class="language-plaintext highlighter-rouge">accumulo-env.sh</code> is loaded and can be used to customize environment.</li> <li>The <code class="language-plaintext highlighter-rouge">JAVA_OPTS</code> variable is constructed in <code class="language-plaintext highlighter-rouge">accumulo-env.sh</code> to pass command-line arguments to the <code class="language-plaintext highlighter-rouge">java</code> command that the starts Accumulo processes (i.e. 
<code class="language-plaintext highlighter-rouge">java $JAVA_OPTS main.class.for.$cmd</code>).</li> @@ -540,7 +540,7 @@ that users start using the new API, the old API will continue to be supported th <p>Below is a list of client API changes that users are required to make for 2.0:</p> <ul> - <li>Update your pom.xml use Accumulo 2.0. Also, update any Hadoop & ZooKeeper dependencies in your pom.xml to match the versions runing on your cluster. + <li>Update your pom.xml use Accumulo 2.0. Also, update any Hadoop & ZooKeeper dependencies in your pom.xml to match the versions running on your cluster. <div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt"><dependency></span> <span class="nt"><groupId></span>org.apache.accumulo<span class="nt"></groupId></span> <span class="nt"><artifactId></span>accumulo-core<span class="nt"></artifactId></span> @@ -548,7 +548,7 @@ that users start using the new API, the old API will continue to be supported th <span class="nt"></dependency></span> </code></pre></div> </div> </li> - <li>ClientConfiguration objects can no longer be ceated using <code class="language-plaintext highlighter-rouge">new ClientConfiguration()</code>. + <li>ClientConfiguration objects can no longer be created using <code class="language-plaintext highlighter-rouge">new ClientConfiguration()</code>. <ul> <li>Use <code class="language-plaintext highlighter-rouge">ClientConfiguration.create()</code> instead</li> </ul> @@ -562,7 +562,7 @@ that users start using the new API, the old API will continue to be supported th <ul> <li>The API for <a href="/docs/2.x/getting-started/clients#creating-an-accumulo-client">creating Accumulo clients</a> has changed in 2.0. 
<ul> - <li>The old API using <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/ZooKeeperInstance.html">ZooKeeeperInstance</a>, <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/Connector.html">Connector</a>, <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/Instance.html">Instance</a>, and <a href="https://static.javadoc [...] + <li>The old API using <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/ZooKeeperInstance.html">ZooKeeperInstance</a>, <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/Connector.html">Connector</a>, <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/Instance.html">Instance</a>, and <a href="https://static.javadoc. [...] <li><a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/Connector.html">Connector</a> objects can be created from an <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/AccumuloClient.html">AccumuloClient</a> object using <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/Connector.html#from-org.apache.accumulo.core.cl [...] 
</ul> </li> @@ -589,7 +589,7 @@ when creating your MapReduce job.</li> <p>The recommended way to upgrade from a prior 1.7.x release is to stop Accumulo, upgrade to 1.7.y and then start 1.7.y.</p> -<p>When upgrading, there is a known issue if the upgrade fails due to outstanding <a href="/1.7/accumulo_user_manual.html#_fault_tolerant_executor_fate">FATE</a> operations, see <a href="https://issues.apache.org/jira/browse/ACCUMULO-4496">ACCUMULO-4496</a> The work around if this situation is encountered:</p> +<p>When upgrading, there is a known issue if the upgrade fails due to outstanding <a href="/1.7/accumulo_user_manual.html#_fault_tolerant_executor_fate">FATE</a> operations, see <a href="https://issues.apache.org/jira/browse/ACCUMULO-4496">ACCUMULO-4496</a> The workaround if this situation is encountered:</p> <ul> <li>Start tservers</li> diff --git a/output/docs/2.x/configuration/files.html b/output/docs/2.x/configuration/files.html index de0bf067..76177290 100644 --- a/output/docs/2.x/configuration/files.html +++ b/output/docs/2.x/configuration/files.html @@ -437,7 +437,7 @@ <p>The <a href="https://github.com/apache/accumulo/blob/main/assemble/conf/accumulo.properties">accumulo.properties</a> file configures Accumulo server processes using <a href="/docs/2.x/configuration/server-properties">server properties</a>. This file can be found in the <code class="language-plaintext highlighter-rouge">conf/</code> -direcory. It is needed on every host that runs Accumulo processes. Therfore, any configuration should be +directory. It is needed on every host that runs Accumulo processes. Therefore, any configuration should be replicated to all hosts of the Accumulo cluster. If a property is not configured here, it might have been <a href="/docs/2.x/configuration/overview">configured another way</a>. 
See the <a href="/docs/2.x/getting-started/quickstart#configuring-accumulo">quick start</a> for help with configuring this file.</p> @@ -499,11 +499,11 @@ to run standby Monitors that can take over if the lead Monitor fails.</p> <h3 id="tserver">tserver</h3> <p>Contains list of hosts where <a href="/docs/2.x/getting-started/design#tablet-server">Tablet Server</a> processes should run. While only one host is needed, it is recommended that -multiple tablet servers are run for improved fault tolerance and peformance.</p> +multiple tablet servers are run for improved fault tolerance and performance.</p> <h3 id="sserver">sserver</h3> -<p>Contains a list of hosts where <a href="/docs/2.x/getting-started/design#scan-server-experimental">ScanServer</a> processes should run. While only only one host is needed, it is recommended +<p>Contains a list of hosts where <a href="/docs/2.x/getting-started/design#scan-server-experimental">ScanServer</a> processes should run. While only one host is needed, it is recommended that multiple ScanServers are run for improved performance.</p> <h3 id="compaction-coordinator">compaction coordinator</h3> diff --git a/output/docs/2.x/configuration/overview.html b/output/docs/2.x/configuration/overview.html index aa4fca1c..eaae79ba 100644 --- a/output/docs/2.x/configuration/overview.html +++ b/output/docs/2.x/configuration/overview.html @@ -439,7 +439,7 @@ <h2 id="server-configuration">Server Configuration</h2> -<p>Accumulo processes (i.e manager, tablet server, monitor, etc) are configured by <a href="/docs/2.x/configuration/server-properties">server properties</a> whose values can be +<p>Accumulo processes (i.e. manager, tablet server, monitor, etc.) 
are configured by <a href="/docs/2.x/configuration/server-properties">server properties</a> whose values can be set in the following configuration locations (with increasing precedence):</p> <ol> @@ -498,7 +498,7 @@ the following shell command:</p> </code></pre></div></div> <h3 id="namespace">Namespace</h3> -<p>Namespace configuration refers to <a href="/docs/2.x/configuration/server-properties#table_prefix">table.* properties</a> set for a certain table namespace (i.e group of tables). These settings are stored in ZooKeeper. Namespace configuration +<p>Namespace configuration refers to <a href="/docs/2.x/configuration/server-properties#table_prefix">table.* properties</a> set for a certain table namespace (i.e. group of tables). These settings are stored in ZooKeeper. Namespace configuration will override System configuration and can be set using the following shell command:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>config -ns NAMESPACE -s PROPERTY=VALUE diff --git a/output/docs/2.x/configuration/server-properties.html b/output/docs/2.x/configuration/server-properties.html index 45734896..7237b5b8 100644 --- a/output/docs/2.x/configuration/server-properties.html +++ b/output/docs/2.x/configuration/server-properties.html @@ -432,7 +432,7 @@ <!-- WARNING: Do not edit this file. It is a generated file that is copied from Accumulo build (from core/target/generated-docs) --> -<p>Below are properties set in <code class="language-plaintext highlighter-rouge">accumulo.properties</code> or the Accumulo shell that configure Accumulo servers (i.e tablet server, manager, etc). Properties labeled ‘Experimental’ should not be considered stable and have a higher risk of changing in the future.</p> +<p>Below are properties set in <code class="language-plaintext highlighter-rouge">accumulo.properties</code> or the Accumulo shell that configure Accumulo servers (i.e. tablet server, manager, etc). 
Properties labeled ‘Experimental’ should not be considered stable and have a higher risk of changing in the future.</p> <table> <thead> diff --git a/output/docs/2.x/development/iterators.html b/output/docs/2.x/development/iterators.html index 82dc9aa8..71b8225e 100644 --- a/output/docs/2.x/development/iterators.html +++ b/output/docs/2.x/development/iterators.html @@ -493,7 +493,7 @@ optimize itself.</p> <p>The purpose of the seek method is to advance the stream of Key-Value pairs to a certain point in the iteration (the Accumulo table). It is common that before the implementation of this method returns some additional processing is performed which may further advance the current position past the <code class="language-plaintext highlighter-rouge">startKey</code> of the <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/data/Range.html">Range</a>. This, however, is dependent on the functionality the iterator provides. For -example, a filtering iterator would consume a number Key-Value pairs which do not meets its criteria before <code class="language-plaintext highlighter-rouge">seek</code> +example, a filtering iterator would consume a number Key-Value pairs which do not meet its criteria before <code class="language-plaintext highlighter-rouge">seek</code> returns. The important condition for <code class="language-plaintext highlighter-rouge">seek</code> to meet is that this Iterator should be ready to return the first Key-Value pair, or none if no such pair is available, when the method returns. 
The Key-Value pair would be returned by <code class="language-plaintext highlighter-rouge">getTopKey</code> and <code class="language-plaintext highlighter-rouge">getTopValue</code>, respectively, and <code class="language-plaintext highlighter-rouge">hasTop</code> should return a boolean denoting whether or not there is @@ -746,7 +746,7 @@ values will be grouped, insert mutations with those fields as the key, and confi the table with a combining iterator that supports the summarizing operation desired.</p> -<p>The only restriction on an combining iterator is that the combiner developer +<p>The only restriction on a combining iterator is that the combiner developer should not assume that all values for a given key have been seen, since new mutations can be inserted at anytime. This precludes using the total number of values in the aggregation such as when calculating an average, for example.</p> @@ -756,7 +756,7 @@ feature vectors for use in machine learning algorithms. For example, many algorithms such as k-means clustering, support vector machines, anomaly detection, etc. use the concept of a feature vector and the calculation of distance metrics to learn a particular model. The columns in an Accumulo table can be used to efficiently -store sparse features and their weights to be incrementally updated via the use of an +store sparse features and their weights to be incrementally updated via the use of a combining iterator.</p> <h2 id="best-practices">Best practices</h2> diff --git a/output/docs/2.x/development/sampling.html b/output/docs/2.x/development/sampling.html index 4037e9ae..2403edbc 100644 --- a/output/docs/2.x/development/sampling.html +++ b/output/docs/2.x/development/sampling.html @@ -438,7 +438,7 @@ placed in the sample data is configurable per table.</p> <p>This feature can be used for query estimation and optimization. 
For an example of estimation, assume an Accumulo table is configured to generate a sample -containing one millionth of a tables data. If a query is executed against the +containing one millionth of the table’s data. If a query is executed against the sample and returns one thousand results, then the same query against all the data would probably return a billion results. A nice property of having Accumulo generate the sample is that its always up to date. So estimations diff --git a/output/docs/2.x/development/summaries.html b/output/docs/2.x/development/summaries.html index ffa18490..8dffb526 100644 --- a/output/docs/2.x/development/summaries.html +++ b/output/docs/2.x/development/summaries.html @@ -433,7 +433,7 @@ <h2 id="overview">Overview</h2> <p>Accumulo has the ability to generate summary statistics about data in a table -using user defined functions. Currently these statistics are only generated for +using user defined functions. Currently, these statistics are only generated for data written to files. Data recently written to Accumulo that is still in memory will not contribute to summary statistics.</p> @@ -577,11 +577,11 @@ visibilities that were not counted.</p> tooMany = 4 </code></pre></div></div> -<p>Another summarizer is configured below that tracks the number of deletes. Also +<p>Another summarizer is configured below that tracks the number of deletes. Also, a compaction strategy that uses this summary data is configured. The <code class="language-plaintext highlighter-rouge">TooManyDeletesCompactionStrategy</code> will force a compaction of the tablet when the ratio of deletes to non-deletes is over 25%. This threshold is -configurable. Below a delete is added and its reflected in the statistics. In +configurable. Below a delete is added and it’s reflected in the statistics. 
In this case there is 1 delete and 10 non-deletes, not enough to force a compaction of the tablet.</p> diff --git a/output/docs/2.x/getting-started/clients.html b/output/docs/2.x/getting-started/clients.html index 0755f370..2b207966 100644 --- a/output/docs/2.x/getting-started/clients.html +++ b/output/docs/2.x/getting-started/clients.html @@ -483,7 +483,7 @@ of the tarball distribution): </li> </ol> -<p>If an <a href="/docs/2.x/configuration/files#accumulo-clientproperties">accumulo-client.properties</a> file or a Java Properties object is used to create a <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/AccumuloClient.html">AccumuloClient</a>, the following +<p>If an <a href="/docs/2.x/configuration/files#accumulo-clientproperties">accumulo-client.properties</a> file or a Java Properties object is used to create an <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/AccumuloClient.html">AccumuloClient</a>, the following <a href="/docs/2.x/configuration/client-properties">client properties</a> must be set:</p> <ul> @@ -521,7 +521,7 @@ of the tarball distribution): </tbody> </table> -<p>If a token class is used for <code class="language-plaintext highlighter-rouge">auth.type</code>, you can create create a Base64 encoded token using the <code class="language-plaintext highlighter-rouge">accumulo create-token</code> command.</p> +<p>If a token class is used for <code class="language-plaintext highlighter-rouge">auth.type</code>, you can create a Base64 encoded token using the <code class="language-plaintext highlighter-rouge">accumulo create-token</code> command.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ accumulo create-token Username (aka principal): root @@ -533,7 +533,7 @@ auth.token = AAAAGh+LCAAAAAAAAAArTk0uSi0BAOXoolwGAAAA <h1 id="authentication">Authentication</h1> -<p>When creating 
a <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/AccumuloClient.html">AccumuloClient</a>, the user must be authenticated using one of the following +<p>When creating an <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/AccumuloClient.html">AccumuloClient</a>, the user must be authenticated using one of the following implementations of <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/security/tokens/AuthenticationToken.html">AuthenticationToken</a> below:</p> <ol> @@ -606,11 +606,11 @@ absence. For example a conditional mutation can require that column A is absent inorder to be applied. Iterators can be applied when checking conditions. Using iterators, many other operations besides equality and absence can be checked. For example, using an iterator that converts values -less than 5 to 0 and everything else to 1, its possible to only apply a +less than 5 to 0 and everything else to 1, it’s possible to only apply a mutation when a column is less than 5.</p> <p>In the case when a tablet server dies after a client sent a conditional -mutation, its not known if the mutation was applied or not. When this happens +mutation, it’s not known if the mutation was applied or not. When this happens the <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/ConditionalWriter.html">ConditionalWriter</a> reports a status of UNKNOWN for the ConditionalMutation. In many cases this situation can be dealt with by simply reading the row again and possibly sending another conditional mutation. 
If this is not sufficient, @@ -648,7 +648,7 @@ These levels are:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> root@uno> config -t mytable -s table.durability=sync </code></pre></div> </div> </li> - <li>When creating a <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/AccumuloClient.html">AccumuloClient</a>, the default durability can be overridden using <code class="language-plaintext highlighter-rouge">withBatchWriterConfig()</code> + <li>When creating an <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/AccumuloClient.html">AccumuloClient</a>, the default durability can be overridden using <code class="language-plaintext highlighter-rouge">withBatchWriterConfig()</code> or by setting <a href="/docs/2.x/configuration/client-properties#batch_writer_durability">batch.writer.durability</a> in <a href="/docs/2.x/configuration/files#accumulo-clientproperties">accumulo-client.properties</a>.</li> <li> <p>When a BatchWriter or ConditionalWriter is created, the durability settings above will be overridden @@ -721,7 +721,7 @@ columns, it is possible that you will only see two of those modifications. With the isolated scanner either all three of the changes are seen or none.</p> <p>The <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/client/IsolatedScanner.html">IsolatedScanner</a> buffers rows on the client side so a large row will not -crash a tablet server. By default rows are buffered in memory, but the user +crash a tablet server. 
By default, rows are buffered in memory, but the user can easily supply their own buffer if they wish to buffer to disk when rows are large.</p> diff --git a/output/docs/2.x/getting-started/design.html b/output/docs/2.x/getting-started/design.html index 7ee38f59..dd960687 100644 --- a/output/docs/2.x/getting-started/design.html +++ b/output/docs/2.x/getting-started/design.html @@ -569,7 +569,7 @@ ingest and query load is balanced across the cluster.</p> <p>When a write arrives at a TabletServer it is written to a Write-Ahead Log and then inserted into a sorted data structure in memory called a MemTable. When the MemTable reaches a certain size, the TabletServer writes out the sorted -key-value pairs to a file in HDFS called an <a href="#rfile">RFile</a>). This process is +key-value pairs to a file in HDFS called an <a href="#rfile">RFile</a>. This process is called a minor compaction. A new MemTable is then created and the fact of the compaction is recorded in the Write-Ahead Log.</p> @@ -608,7 +608,7 @@ for more information.</p> <h2 id="splitting">Splitting</h2> <p>When a table is created it has one tablet. As the table grows its initial -tablet eventually splits into two tablets. Its likely that one of these +tablet eventually splits into two tablets. It’s likely that one of these tablets will migrate to another tablet server. As the table continues to grow, its tablets will continue to split and be migrated. The decision to automatically split a tablet is based on the size of a tablets files. 
The diff --git a/output/docs/2.x/getting-started/features.html b/output/docs/2.x/getting-started/features.html index 4981f19e..94267461 100644 --- a/output/docs/2.x/getting-started/features.html +++ b/output/docs/2.x/getting-started/features.html @@ -463,7 +463,7 @@ as <code class="language-plaintext highlighter-rouge">(A&B)|C</code>) and au <p><a href="/docs/2.x/getting-started/table_configuration#constraints">Constraints</a> are configurable conditions where table writes are rejected. Constraints are written in Java and configurable -on a per table basis.</p> +on a per-table basis.</p> <h3 id="sharding">Sharding</h3> @@ -587,9 +587,9 @@ can become quite large. When the index is large, a lot of memory is consumed and files take a long time to open. To avoid this problem, RFiles have a multi-level index tree. Index blocks can point to other index blocks or data blocks. The entire index never has to be resident, even when the file is -written. When an index block exceeds the configurable size threshold, its +written. When an index block exceeds the configurable size threshold, it’s written out between data blocks. The size of index blocks is configurable on a -per table basis.</p> +per-table basis.</p> <h3 id="binary-search-in-rfile-blocks">Binary search in RFile blocks</h3> @@ -703,7 +703,7 @@ plugins in a stable manner.</p> <h3 id="balancer">Balancer</h3> <p>Users can provide a balancer plugin that decides how to distribute tablets -across a table. These plugins can be provided on a per table basis. This is +across a table. These plugins can be provided on a per-table basis. This is useful for ensuring a particular table’s tablets are placed optimally for tables with special query needs. The default balancer randomly spreads each table’s tablets across the cluster. 
It takes into account where a tablet was diff --git a/output/docs/2.x/getting-started/glossary.html b/output/docs/2.x/getting-started/glossary.html index 7f998314..75c2aa24 100644 --- a/output/docs/2.x/getting-started/glossary.html +++ b/output/docs/2.x/getting-started/glossary.html @@ -618,7 +618,7 @@ different servers over time.</p> <dt>timestamp</dt> <dd> <blockquote> - <p>the portion of the key that controls versioning. Otherwise identical keys + <p>the portion of the key that controls versioning. Otherwise, identical keys with differing timestamps are considered to be versions of a single <em>cell</em>. Accumulo can be configured to keep the <em>N</em> newest versions of each <em>cell</em>. When a deletion entry is inserted, it deletes diff --git a/output/docs/2.x/getting-started/quickstart.html b/output/docs/2.x/getting-started/quickstart.html index 8c4124fc..c8532a85 100644 --- a/output/docs/2.x/getting-started/quickstart.html +++ b/output/docs/2.x/getting-started/quickstart.html @@ -507,7 +507,7 @@ to <code class="language-plaintext highlighter-rouge">false</code>.</p> </li> <li> <p>Set <a href="/docs/2.x/configuration/server-properties#instance_volumes">instance.volumes</a> to HDFS location where Accumulo will store -data. If your namenode is running at 192.168.1.9:8020 and you want to store +data. If your namenode is running at 192.168.1.9:8020, and you want to store data in <code class="language-plaintext highlighter-rouge">/accumulo</code> in HDFS, then set <a href="/docs/2.x/configuration/server-properties#instance_volumes">instance.volumes</a> to <code class="language-plaintext highlighter-rouge">hdfs://192.168.1.9:8020/accumulo</code>.</p> </li> @@ -652,7 +652,7 @@ started.</p> <h3 id="run-individual-accumulo-services">Run individual Accumulo services</h3> -<p>Start individual Accumulo processes (tserver, master, monitor, etc) as a +<p>Start individual Accumulo processes (tserver, master, monitor, etc.) 
as a background service using the example accumulo-service script followed by the service name. For example, to start only the tserver, run:</p> @@ -699,7 +699,7 @@ all nodes where compactors should run.</li> </ul> <p>The Accumulo, Hadoop, and Zookeeper software should be present at the same -location on every node. Also the files in the <code class="language-plaintext highlighter-rouge">conf</code> directory must be copied to +location on every node. Also, the files in the <code class="language-plaintext highlighter-rouge">conf</code> directory must be copied to every node. There are many ways to replicate the software and configuration, two possible tools that can help replicate software and/or config are <a href="https://code.google.com/p/pdsh/">pdcp</a> and <a href="https://code.google.com/p/parallel-ssh/">prsync</a>.</p> diff --git a/output/docs/2.x/getting-started/table_configuration.html b/output/docs/2.x/getting-started/table_configuration.html index feabddfc..4bcb1fdf 100644 --- a/output/docs/2.x/getting-started/table_configuration.html +++ b/output/docs/2.x/getting-started/table_configuration.html @@ -503,7 +503,7 @@ com.test.ExampleConstraint=1 com.test.AnotherConstraint=2 </code></pre></div></div> -<p>Currently there are no general-purpose constraints provided with the Accumulo +<p>Currently, there are no general-purpose constraints provided with the Accumulo distribution. 
New constraints can be created by writing a Java class that implements the <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/constraints/Constraint.html">Constraint</a> interface.</p> @@ -616,9 +616,9 @@ user@myinstance mytable> config -t mytable -s table.iterator.minc.vers.opt.ma user@myinstance mytable> config -t mytable -s table.iterator.majc.vers.opt.maxVersions=3 </code></pre></div></div> -<p>When a table is created, by default its configured to use the +<p>When a table is created, by default it’s configured to use the VersioningIterator and keep one version. A table can be created without the -VersioningIterator with the -ndi option in the shell. Also the Java API +VersioningIterator with the -ndi option in the shell. Also, the Java API has the following method</p> <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">client</span><span class="o">.</span><span class="na">tableOperations</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="nc">String</span> <span class="n">tableName</span><span class="o">,</span> <span class="kt">boolean</span> <span class="n">limitVersion</span><span class="o">);</span> @@ -629,7 +629,7 @@ has the following method</p> <p>Accumulo 1.2 introduces the concept of logical time. This ensures that timestamps set by Accumulo always move forward. This helps avoid problems caused by TabletServers that have different time settings. The per tablet counter gives unique -one up time stamps on a per mutation basis. When using time in milliseconds, if +one up time stamps on a per-mutation basis. When using time in milliseconds, if two things arrive within the same millisecond then both receive the same timestamp. When using time in milliseconds, Accumulo set times will still always move forward and never backwards.</p> @@ -775,7 +775,7 @@ in reduced read latency. 
Read the <a href="/docs/2.x/administration/caching">Cac <p>Accumulo will balance and distribute tables across servers. Before a table gets large, it will be maintained as a single tablet on a single server. This limits the speed at which data can be added or queried -to the speed of a single node. To improve performance when the a table +to the speed of a single node. To improve performance when a table is new, or small, you can add split points and generate new tablets.</p> <p>In the shell:</p> @@ -882,7 +882,7 @@ flush option is present and is enabled by default in the shell. If the flush option is not enabled, then any data the source table currently has in memory will not exist in the clone.</p> -<p>A cloned table copies the configuration of the source table. However the +<p>A cloned table copies the configuration of the source table. However, the permissions of the source table are not copied to the clone. After a clone is created, only the user that created the clone can read and write to it.</p> @@ -957,7 +957,7 @@ root@a14 cic> <h2 id="exporting-tables">Exporting Tables</h2> <p>Accumulo supports exporting tables for the purpose of copying tables to another -cluster. Exporting and importing tables preserves the tables configuration, +cluster. Exporting and importing tables preserves the tables’ configuration, splits, and logical time. Tables are exported and then copied via the hadoop <code class="language-plaintext highlighter-rouge">distcp</code> command. To export a table, it must be offline and stay offline while <code class="language-plaintext highlighter-rouge">distcp</code> runs. Staying offline prevents files from being deleted during the process. 
diff --git a/output/docs/2.x/security/authorizations.html b/output/docs/2.x/security/authorizations.html index 3b4e26b1..2e85711c 100644 --- a/output/docs/2.x/security/authorizations.html +++ b/output/docs/2.x/security/authorizations.html @@ -450,7 +450,7 @@ preserving data confidentiality.</p> <h3 id="writing-labeled-data">Writing labeled data</h3> <p>When <a href="/docs/2.x/getting-started/clients#writing-data">writing data to Accumulo</a>, users can -specify a security label for each value by passing a [ColumnVisibilty] to the <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/data/Mutation.html">Mutation</a>.</p> +specify a security label for each value by passing a <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/security/ColumnVisibility.html">ColumnVisibility</a> to the <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.0/org/apache/accumulo/core/data/Mutation.html">Mutation</a>.</p> <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">try</span> <span class="o">(</span><span class="nc">BatchWriter</span> <span class="n">writer</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">createBatchWriter</span><span class="o">(</span><span class="s">"employees"</span><span class="o">))</span> <span class="o">{</span> <span class="nc">Mutation</span> <span class="n">mut</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Mutation</span><span class="o">(</span><span class="s">"employee1"</span><span class="o">);</span> @@ -467,7 +467,7 @@ value the label is associated with. 
The set of tokens required can be specified syntax that supports logical AND <code class="language-plaintext highlighter-rouge">&</code> and OR <code class="language-plaintext highlighter-rouge">|</code> combinations of terms, as well as nesting groups <code class="language-plaintext highlighter-rouge">()</code> of terms together.</p> -<p>Each term is comprised of one to many alpha-numeric characters, hyphens, underscores or +<p>Each term is comprised of one to many alphanumeric characters, hyphens, underscores or periods. Optionally, each term may be wrapped in quotation marks which removes the restriction on valid characters. In quoted terms, quotation marks and backslash characters can be used as characters in the term by escaping them diff --git a/output/docs/2.x/security/kerberos.html b/output/docs/2.x/security/kerberos.html index 22996de3..b55abf53 100644 --- a/output/docs/2.x/security/kerberos.html +++ b/output/docs/2.x/security/kerberos.html @@ -448,13 +448,13 @@ which allow cross-language integration with Kerberos for authentication. GSSAPI, the generic security service application program interface, is a standard which Kerberos implements. In the Java programming language, the language itself also implements GSSAPI which is leveraged by other applications, like Apache Hadoop and Apache Thrift. -SASL, simple authentication and security layer, is a framework for authentication and +SASL, simple authentication and security layer, is a framework for authentication and security over the network. SASL provides a number of mechanisms for authentication, one of which is GSSAPI. Thus, SASL provides the transport which authenticates using GSSAPI that Kerberos implements.</p> <p>Kerberos is a very complicated software application and is deserving of much -more description than can be provided here. An <a href="https://www.roguelynn.com/words/explain-like-im-5-kerberos/">explain like I`m 5</a> +more description than can be provided here. 
An <a href="https://www.roguelynn.com/words/explain-like-im-5-kerberos/">explain like I’m 5</a> blog post is very good at distilling the basics, while <a href="https://web.mit.edu/kerberos/">MIT Kerberos’s project page</a> contains lots of documentation for users or administrators. Various Hadoop “vendors” also provide free documentation that includes step-by-step instructions for @@ -667,7 +667,7 @@ to granting Authorizations and Permissions to new users.</p> <h4 id="administrative-user">Administrative User</h4> <p>Out of the box (without Kerberos enabled), Accumulo has a single user with administrative permissions “root”. -This users is used to “bootstrap” other users, creating less-privileged users for applications using +This user is used to “bootstrap” other users, creating less-privileged users for applications using the system. In Kerberos, to authenticate with the system, it’s required that the client presents Kerberos credentials for the principal (user) the client is trying to authenticate as.</p> @@ -723,7 +723,7 @@ it can only connect to Accumulo as itself. Impersonation, in this context, refer of the proxy to authenticate to Accumulo as itself, but act on behalf of an Accumulo user.</p> <p>Accumulo supports basic impersonation of end-users by a third party via static rules in -<code class="language-plaintext highlighter-rouge">accumulo.properties</code>. These two properties are semi-colon separated properties which are aligned +<code class="language-plaintext highlighter-rouge">accumulo.properties</code>. These two properties are semicolon separated properties which are aligned by index. 
This first element in the user impersonation property value matches the first element in the host impersonation property value, etc.</p> @@ -886,7 +886,7 @@ JVM to each YARN task is secure, even in multi-tenant instances.</p> <h3 id="debugging">Debugging</h3> -<p><strong>Q</strong>: I have valid Kerberos credentials and a correct client configuration file but +<p><strong>Q</strong>: I have valid Kerberos credentials and a correct client configuration file, but I still get errors like:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] diff --git a/output/docs/2.x/security/on-disk-encryption.html b/output/docs/2.x/security/on-disk-encryption.html index c6e0f47b..a2e3ce1f 100644 --- a/output/docs/2.x/security/on-disk-encryption.html +++ b/output/docs/2.x/security/on-disk-encryption.html @@ -537,7 +537,7 @@ even with the crypto service enabled.</p> <p>For queries, data is decrypted when read from RFiles and cached in memory. This means that data is unencrypted in memory while Accumulo is running. Depending on the situation, this also means that some data can be printed to logs. A stacktrace being logged during an exception is one example. 
Accumulo developers have made sure not to expose data protected by authorizations during logging, but -its the additional data that gets encrypted on-disk that could be exposed in a log file.</p> +it’s the additional data that gets encrypted on-disk that could be exposed in a log file.</p> <h4 id="bulk-import">Bulk Import</h4> diff --git a/output/docs/2.x/security/wire-encryption.html b/output/docs/2.x/security/wire-encryption.html index e4089835..f26bc7cb 100644 --- a/output/docs/2.x/security/wire-encryption.html +++ b/output/docs/2.x/security/wire-encryption.html @@ -549,7 +549,7 @@ keytool <span class="nt">-import</span> <span class="nt">-trustcacerts</span> <s </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">server.jks</code> file is the Java keystore containing the certificate for a given host. The above -methods are equivalent whether the certificate is generate for an Accumulo server or a client.</p> +methods are equivalent whether the certificate is generated for an Accumulo server or a client.</p> diff --git a/output/docs/2.x/troubleshooting/advanced.html b/output/docs/2.x/troubleshooting/advanced.html index f547ec99..bd067d17 100644 --- a/output/docs/2.x/troubleshooting/advanced.html +++ b/output/docs/2.x/troubleshooting/advanced.html @@ -437,7 +437,7 @@ <p>The primary reason a tablet server loses its lock is that it has been pushed into swap.</p> <p>A large java program (like the tablet server) may have a large portion -of its memory image unused. The operation system will favor pushing +of its memory image unused. The operating system will favor pushing this allocated, but unused memory into swap so that the memory can be re-used as a disk buffer. When the java virtual machine decides to access this memory, the OS will begin flushing disk buffers to return that @@ -545,7 +545,7 @@ due to the JVM running out of memory, or being swapped out to disk.</p> <p><strong>I had disastrous HDFS failure. 
After bringing everything back up, several tablets refuse to go online.</strong></p> <p>Data written to tablets is written into memory before being written into indexed files. In case the server -is lost before the data is saved into a an indexed file, all data stored in memory is first written into a +is lost before the data is saved into an indexed file, all data stored in memory is first written into a write-ahead log (WAL). When a tablet is re-assigned to a new tablet server, the write-ahead logs are read to recover any mutations that were in memory when the tablet was last hosted.</p> @@ -630,7 +630,7 @@ reliably just remove those references.</p> </code></pre></div></div> <p>If files under <code class="language-plaintext highlighter-rouge">/accumulo/tables</code> are corrupt, the best course of action is to -recover those files in hdsf see the section on HDFS. Once these recovery efforts +recover those files in hdfs see the section on HDFS. Once these recovery efforts have been exhausted, the next step depends on where the missing file(s) are located. Different actions are required when the bad files are in Accumulo data table files or if they are metadata table files.</p> @@ -669,7 +669,7 @@ before creating the new instance. You will not be able to use RestoreZookeeper because the table names and references are likely to be different between the original and the new instances, but it can serve as a reference.</p> -<p>If the files cannot be recovered, replace corrupt data files with a empty +<p>If the files cannot be recovered, replace corrupt data files with an empty rfiles to allow references in the metadata table and in the tablet servers to be resolved. 
Rebuild the metadata table if the corrupt files are metadata files.</p>
@@ -682,7 +682,7 @@ WAL file, never being able to succeed.</p>
 <p>In the cases where the WAL file’s original contents are unrecoverable or some degree of data loss is acceptable (beware if the WAL file contains updates to the Accumulo
-metadata table!), the following process can be followed to create an valid, empty
+metadata table!), the following process can be followed to create a valid, empty
 WAL file. Run the following commands as the Accumulo unix user (to ensure that the proper file permissions in HDFS)</p>
diff --git a/output/docs/2.x/troubleshooting/basic.html b/output/docs/2.x/troubleshooting/basic.html
index 774b6b1e..7c2721e1 100644
--- a/output/docs/2.x/troubleshooting/basic.html
+++ b/output/docs/2.x/troubleshooting/basic.html
@@ -440,7 +440,7 @@ these remote computers writes down events as they occur, into a local
 file. By default, this is defined in <code class="language-plaintext highlighter-rouge">conf/accumulo-env.sh</code> as <code class="language-plaintext highlighter-rouge">ACCUMULO_LOG_DIR</code>. Look in the <code class="language-plaintext highlighter-rouge">$ACCUMULO_LOG_DIR/tserver*.log</code> file. Specifically, check the end of the file.</p>
-<p><strong>The tablet server did not start and the debug log does not exists! What happened?</strong></p>
+<p><strong>The tablet server did not start and the debug log does not exist! What happened?</strong></p>
 <p>When the individual programs are started, the stdout and stderr output of these programs are stored in <code class="language-plaintext highlighter-rouge">.out</code> and <code class="language-plaintext highlighter-rouge">.err</code> files in
@@ -558,7 +558,7 @@ $ accumulo-service tserver start
 <p><strong>My process died again. Should I restart it via <code class="language-plaintext highlighter-rouge">cron</code> or tools like <code class="language-plaintext highlighter-rouge">supervisord</code>?</strong></p>
-<p>A repeatedly dying Accumulo process is a sign of a larger problem. Typically these problems are due to a
+<p>A repeatedly dying Accumulo process is a sign of a larger problem. Typically, these problems are due to a
 misconfiguration of Accumulo or over-saturation of resources. Blind automation of any service restart inside of Accumulo is generally an undesirable situation as it is indicative of a problem that is being masked and ignored. Accumulo processes should be stable on the order of months and not require frequent restart.</p>
@@ -576,7 +576,7 @@ processes should be stable on the order of months and not require frequent resta
 of the visibilities in the underlying data.</p>
 <p>Note that the use of <code class="language-plaintext highlighter-rouge">rfile-info</code> is an administrative tool and can only
-by used by someone who can access the underlying Accumulo data. It
+be used by someone who can access the underlying Accumulo data. It
 does not provide the normal access controls in Accumulo.</p>
 <h2 id="ingest">Ingest</h2>
diff --git a/output/docs/2.x/troubleshooting/system-metadata-tables.html b/output/docs/2.x/troubleshooting/system-metadata-tables.html
index f969a6c0..e9402903 100644
--- a/output/docs/2.x/troubleshooting/system-metadata-tables.html
+++ b/output/docs/2.x/troubleshooting/system-metadata-tables.html
@@ -509,7 +509,7 @@ shell> scan -b 3; -e 3<
 </li>
 <li>
 <p><code class="language-plaintext highlighter-rouge">srv:time [] M1373998392323</code> -
- This indicates the time time type (<code class="language-plaintext highlighter-rouge">M</code> for milliseconds or <code class="language-plaintext highlighter-rouge">L</code> for logical) and the timestamp of the most recently written key in this tablet. It is used to ensure automatically assigned key timestamps are strictly increasing for the tablet, regardless of the tablet server’s system time.</p>
+ This indicates the time type (<code class="language-plaintext highlighter-rouge">M</code> for milliseconds or <code class="language-plaintext highlighter-rouge">L</code> for logical) and the timestamp of the most recently written key in this tablet. It is used to ensure automatically assigned key timestamps are strictly increasing for the tablet, regardless of the tablet server’s system time.</p>
 </li>
 <li>
 <p><code class="language-plaintext highlighter-rouge">~tab:~pr [] \x00</code> -
diff --git a/output/docs/2.x/troubleshooting/tools.html b/output/docs/2.x/troubleshooting/tools.html
index 6240e8a4..44a017f9 100644
--- a/output/docs/2.x/troubleshooting/tools.html
+++ b/output/docs/2.x/troubleshooting/tools.html
@@ -585,7 +585,7 @@ No problems found in accumulo.metadata (!0)
 <h2 id="removeentriesformissingfiles">RemoveEntriesForMissingFiles</h2>
-<p>If your Hadoop cluster has a lost a file due to a NameNode failure, you can remove the
+<p>If your Hadoop cluster has a lost a file due to a NameNode failure, you can remove the
 file reference using <code class="language-plaintext highlighter-rouge">RemoveEntriesForMissingFiles</code>. It will check every file reference and ensure that the file exists in HDFS. Optionally, it will remove the reference:</p>
@@ -764,7 +764,7 @@ Table ids:
 <h2 id="mode-print-property-mappings">mode: print property mappings</h2>
-<p>With Accumulo version 2.1, the storage of properties in ZooKeeper has changed and the properties are not direclty
+<p>With Accumulo version 2.1, the storage of properties in ZooKeeper has changed and the properties are not directly
 readable with the ZooKeeper zkCli utility. The properties can be listed in an Accumulo shell with the <code class="language-plaintext highlighter-rouge">config</code> command.
 However, if a shell is not available, this utility <code class="language-plaintext highlighter-rouge">zoo-info-viewer</code> can be used instead.</p>
diff --git a/output/feed.xml b/output/feed.xml
index f6f91c93..b9583d24 100644
--- a/output/feed.xml
+++ b/output/feed.xml
@@ -6,8 +6,8 @@
 </description>
 <link>https://accumulo.apache.org/</link>
 <atom:link href="https://accumulo.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Wed, 02 Nov 2022 15:14:11 +0000</pubDate>
- <lastBuildDate>Wed, 02 Nov 2022 15:14:11 +0000</lastBuildDate>
+ <pubDate>Wed, 02 Nov 2022 21:26:29 +0000</pubDate>
+ <lastBuildDate>Wed, 02 Nov 2022 21:26:29 +0000</lastBuildDate>
 <generator>Jekyll v4.3.1</generator>
diff --git a/output/search_data.json b/output/search_data.json
index f3efba73..9ec15a4e 100644
--- a/output/search_data.json
+++ b/output/search_data.json
@@ -2,14 +2,14 @@
 "docs-2-x-administration-caching": {
 "title": "Caching",
- "content": "Accumulo tablet servers have block caches that buffer data in memory to limit reads from disk.This caching has the following benefits: reduces latency when reading data helps alleviate hotspots in tablesEach tablet server has an index and data block cache that is shared by all hosted tablets (see the tablet server diagramto learn more). A typical Accumulo read operation will perform a binary search over several index blocks followed by a linear scanof one or more data [...]
+ "content": "Accumulo tablet servers have block caches that buffer data in memory to limit reads from disk.This caching has the following benefits: reduces latency when reading data helps alleviate hotspots in tablesEach tablet server has an index and data block cache that is shared by all hosted tablets (see the tablet server diagramto learn more). A typical Accumulo read operation will perform a binary search over several index blocks followed by a linear scanof one or more data [...]
 "url": " /docs/2.x/administration/caching",
 "categories": "administration"
 },
 "docs-2-x-administration-compaction": {
 "title": "Compactions",
- "content": "In Accumulo each tablet has a list of files associated with it. As data iswritten to Accumulo it is buffered in memory. The data buffered in memory iseventually written to files in DFS on a per tablet basis. Files can also beadded to tablets directly by bulk import. In the background tablet servers runmajor compactions to merge multiple files into one. The tablet server has todecide which tablets to compact and which files within a tablet to compact.Within each tablet [...]
+ "content": "In Accumulo each tablet has a list of files associated with it. As data iswritten to Accumulo it is buffered in memory. The data buffered in memory iseventually written to files in DFS on a per-tablet basis. Files can also beadded to tablets directly by bulk import. In the background tablet servers runmajor compactions to merge multiple files into one. The tablet server has todecide which tablets to compact and which files within a tablet to compact.Within each tablet [...]
 "url": " /docs/2.x/administration/compaction",
 "categories": "administration"
 },
@@ -23,14 +23,14 @@
 "docs-2-x-administration-fate": {
 "title": "FATE",
- "content": "Accumulo must implement a number of distributed, multi-step operations to supportthe client API. Creating a new table is a simple example of an atomic client callwhich requires multiple steps in the implementation: get a unique table ID, configuredefault table permissions, populate information in ZooKeeper to record the table’sexistence, create directories in HDFS for the table’s data, etc. Implementing thesesteps in a way that is tolerant to node failure and other conc [...]
+ "content": "Accumulo must implement a number of distributed, multi-step operations to supportthe client API. Creating a new table is a simple example of an atomic client callwhich requires multiple steps in the implementation: get a unique table ID, configuredefault table permissions, populate information in ZooKeeper to record the table’sexistence, create directories in HDFS for the table’s data, etc. Implementing thesesteps in a way that is tolerant to node failure and other conc [...]
 "url": " /docs/2.x/administration/fate",
 "categories": "administration"
 },
 "docs-2-x-administration-in-depth-install": {
 "title": "In-depth Installation",
- "content": "This document provides detailed instructions for installing Accumulo. For basicinstructions, see the quick start.HardwareBecause we are running essentially two or three systems simultaneously layeredacross the cluster: HDFS, Accumulo and MapReduce, it is typical for hardware toconsist of 4 to 8 cores, and 8 to 32 GB RAM. This is so each running process can haveat least one core and 2 - 4 GB each.One core running HDFS can typically keep 2 to 4 disks busy, so each machine [...]
+ "content": "This document provides detailed instructions for installing Accumulo. For basicinstructions, see the quick start.HardwareBecause we are running essentially two or three systems simultaneously layeredacross the cluster: HDFS, Accumulo and MapReduce, it is typical for hardware toconsist of 4 to 8 cores, and 8 to 32 GB RAM. This is so each running process can haveat least one core and 2 - 4 GB each.One core running HDFS can typically keep 2 to 4 disks busy, so each machine [...]
 "url": " /docs/2.x/administration/in-depth-install",
 "categories": "administration"
 },
@@ -44,28 +44,28 @@
 "docs-2-x-administration-multivolume": {
 "title": "Multi-Volume Installations",
- "content": "This is an advanced configuration setting for very large clustersunder a lot of write pressure.The HDFS NameNode holds all of the metadata about the files inHDFS. For fast performance, all of this information needs to be storedin memory. A single NameNode with 64G of memory can store themetadata for tens of millions of files. However, when scaling beyond athousand nodes, an active Accumulo system can generate lots of updatesto the file system, especially when data is b [...]
+ "content": "This is an advanced configuration setting for very large clustersunder a lot of write pressure.The HDFS NameNode holds all of the metadata about the files inHDFS. For fast performance, all of this information needs to be storedin memory. A single NameNode with 64G of memory can store themetadata for tens of millions of files. However, when scaling beyond athousand nodes, an active Accumulo system can generate lots of updatesto the file system, especially when data is b [...]
 "url": " /docs/2.x/administration/multivolume",
 "categories": "administration"
 },
 "docs-2-x-administration-replication": {
 "title": "Replication",
- "content": "OverviewReplication is a feature of Accumulo which provides a mechanism to automaticallycopy data to other systems, typically for the purpose of disaster recovery,high availability, or geographic locality. It is best to consider this featureas a framework for automatic replication instead of the ability to copy datafrom to another Accumulo instance as copying to another Accumulo cluster isonly an implementation detail. The local Accumulo cluster is hereby referredto as [...]
+ "content": "OverviewReplication is a feature of Accumulo which provides a mechanism to automaticallycopy data to other systems, typically for the purpose of disaster recovery,high availability, or geographic locality. It is best to consider this featureas a framework for automatic replication instead of the ability to copy datafrom to another Accumulo instance as copying to another Accumulo cluster isonly an implementation detail. The local Accumulo cluster is hereby referredto as [...]
 "url": " /docs/2.x/administration/replication",
 "categories": "administration"
 },
 "docs-2-x-administration-scan-executors": {
 "title": "Scan Executors",
- "content": "Accumulo scans operate by repeatedly fetching batches of data from a tabletserver. On the tablet server side, a thread pool fetches batches.In Java threads pools are called executors. By default, a single executor pertablet server handles all scans in FIFO order. For some workloads, the singleFIFO executor is suboptimal. For example, consider many unimportant scansreading lots of data mixed with a few important scans reading small amounts ofdata. The long scans not [...]
+ "content": "Accumulo scans operate by repeatedly fetching batches of data from a tabletserver. On the tablet server side, a thread pool fetches batches.In Java threads pools are called executors. By default, a single executor pertablet server handles all scans in FIFO order. For some workloads, the singleFIFO executor is suboptimal. For example, consider many unimportant scansreading lots of data mixed with a few important scans reading small amounts ofdata. The long scans not [...]
 "url": " /docs/2.x/administration/scan-executors",
 "categories": "administration"
 },
 "docs-2-x-administration-upgrading": {
 "title": "Upgrading Accumulo",
- "content": "Upgrading from 1.10.x or 2.0.x to 2.1The recommended way to upgrade from a prior 1.10.x or 2.0.x release is to stop Accumulo, upgradeto 2.1 and then start 2.1. To upgrade from a release prior to 1.10, follow thebelow steps to upgrade to 2.0 and then perform the upgrade to 2.1. Adirect upgrade from releases prior to 1.10 has not been tested.Rename master Properties, Config Files, and Script ReferencesAlthough not required until at least release 3.0, it is strongly recomm [...]
+ "content": "Upgrading from 1.10.x or 2.0.x to 2.1The recommended way to upgrade from a prior 1.10.x or 2.0.x release is to stop Accumulo, upgradeto 2.1 and then start 2.1. To upgrade from a release prior to 1.10, follow thebelow steps to upgrade to 2.0 and then perform the upgrade to 2.1. Adirect upgrade from releases prior to 1.10 has not been tested.Rename master Properties, Config Files, and Script ReferencesAlthough not required until at least release 3.0, it is strongly recomm [...]
 "url": " /docs/2.x/administration/upgrading",
 "categories": "administration"
 },
@@ -79,21 +79,21 @@
 "docs-2-x-configuration-files": {
 "title": "Configuration Files",
- "content": "Accumulo has the following configuration files which can be found in theconf/ directory of the Accumulo release tarball.accumulo.propertiesThe accumulo.properties file configures Accumulo server processes usingserver properties. This file can be found in the conf/direcory. It is needed on every host that runs Accumulo processes. Therfore, any configuration should bereplicated to all hosts of the Accumulo cluster. If a property is not configured here, it might have beenc [...]
+ "content": "Accumulo has the following configuration files which can be found in theconf/ directory of the Accumulo release tarball.accumulo.propertiesThe accumulo.properties file configures Accumulo server processes usingserver properties. This file can be found in the conf/directory. It is needed on every host that runs Accumulo processes. Therefore, any configuration should bereplicated to all hosts of the Accumulo cluster. If a property is not configured here, it might have bee [...]
 "url": " /docs/2.x/configuration/files",
 "categories": "configuration"
 },
 "docs-2-x-configuration-overview": {
 "title": "Configuration Overview",
- "content": "Configuration is managed differently for Accumulo clients and servers.Client ConfigurationAccumulo clients are created using Java builder methods, a Java properties object or anaccumulo-client.properties file containing client properties.Server ConfigurationAccumulo processes (i.e manager, tablet server, monitor, etc) are configured by server properties whose values can beset in the following configuration locations (with increasing precedence): Default - All propertie [...]
+ "content": "Configuration is managed differently for Accumulo clients and servers.Client ConfigurationAccumulo clients are created using Java builder methods, a Java properties object or anaccumulo-client.properties file containing client properties.Server ConfigurationAccumulo processes (i.e. manager, tablet server, monitor, etc.) are configured by server properties whose values can beset in the following configuration locations (with increasing precedence): Default - All propert [...]
 "url": " /docs/2.x/configuration/overview",
 "categories": "configuration"
 },
 "docs-2-x-configuration-server-properties": {
 "title": "Server Properties",
- "content": "Below are properties set in accumulo.properties or the Accumulo shell that configure Accumulo servers (i.e tablet server, manager, etc). Properties labeled ‘Experimental’ should not be considered stable and have a higher risk of changing in the future. Property Description compaction.coordinator.* ExperimentalAvailable since: 2.1.0Properties in this category affect the behavior of the accumulo compaction coordinator server. [...]
+ "content": "Below are properties set in accumulo.properties or the Accumulo shell that configure Accumulo servers (i.e. tablet server, manager, etc). Properties labeled ‘Experimental’ should not be considered stable and have a higher risk of changing in the future. Property Description compaction.coordinator.* ExperimentalAvailable since: 2.1.0Properties in this category affect the behavior of the accumulo compaction coordinator server. [...]
 "url": " /docs/2.x/configuration/server-properties",
 "categories": "configuration"
 },
@@ -114,7 +114,7 @@
 "docs-2-x-development-iterators": {
 "title": "Iterators",
- "content": "Accumulo SortedKeyValueIterators, commonly referred to as Iterators for short, are server-side programming constructsthat allow users to implement custom retrieval or computational purpose within Accumulo TabletServers. The name rightlybrings forward similarities to the Java Iterator interface; however, Accumulo Iterators are more complex than JavaIterators. Notably, in addition to the expected methods to retrieve the current element and advance to the next elementin t [...]
+ "content": "Accumulo SortedKeyValueIterators, commonly referred to as Iterators for short, are server-side programming constructsthat allow users to implement custom retrieval or computational purpose within Accumulo TabletServers. The name rightlybrings forward similarities to the Java Iterator interface; however, Accumulo Iterators are more complex than JavaIterators. Notably, in addition to the expected methods to retrieve the current element and advance to the next elementin t [...]
 "url": " /docs/2.x/development/iterators",
 "categories": "development"
 },
@@ -135,7 +135,7 @@
 "docs-2-x-development-sampling": {
 "title": "Sampling",
- "content": "OverviewAccumulo has the ability to generate and scan a per table set of sample data.This sample data is kept up to date as a table is mutated. What key values areplaced in the sample data is configurable per table.This feature can be used for query estimation and optimization. For an exampleof estimation, assume an Accumulo table is configured to generate a samplecontaining one millionth of a tables data. If a query is executed against thesample and returns one tho [...]
+ "content": "OverviewAccumulo has the ability to generate and scan a per table set of sample data.This sample data is kept up to date as a table is mutated. What key values areplaced in the sample data is configurable per table.This feature can be used for query estimation and optimization. For an exampleof estimation, assume an Accumulo table is configured to generate a samplecontaining one millionth of the table’s data. If a query is executed against thesample and returns one th [...]
 "url": " /docs/2.x/development/sampling",
 "categories": "development"
 },
@@ -149,42 +149,42 @@
 "docs-2-x-development-summaries": {
 "title": "Summary Statistics",
- "content": "OverviewAccumulo has the ability to generate summary statistics about data in a tableusing user defined functions. Currently these statistics are only generated fordata written to files. Data recently written to Accumulo that is still inmemory will not contribute to summary statistics.This feature can be used to inform a user about what data is in their table.Summary statistics can also be used by compaction strategies to make decisionsabout which files to compact.Sum [...]
+ "content": "OverviewAccumulo has the ability to generate summary statistics about data in a tableusing user defined functions. Currently, these statistics are only generated fordata written to files. Data recently written to Accumulo that is still inmemory will not contribute to summary statistics.This feature can be used to inform a user about what data is in their table.Summary statistics can also be used by compaction strategies to make decisionsabout which files to compact.Su [...]
 "url": " /docs/2.x/development/summaries",
 "categories": "development"
 },
 "docs-2-x-getting-started-clients": {
 "title": "Accumulo Clients",
- "content": "Creating Client CodeIf you are using Maven to create Accumulo client code, add the following dependency to your pom:&lt;dependency&gt; &lt;groupId&gt;org.apache.accumulo&lt;/groupId&gt; &lt;artifactId&gt;accumulo-core&lt;/artifactId&gt; &lt;version&gt;2.1.0&lt;/version&gt;&lt;/dependency&gt;When writing code that uses Accumulo, only use the Accumulo Public API.The accumulo-core artifact includes implemen [...]
+ "content": "Creating Client CodeIf you are using Maven to create Accumulo client code, add the following dependency to your pom:&lt;dependency&gt; &lt;groupId&gt;org.apache.accumulo&lt;/groupId&gt; &lt;artifactId&gt;accumulo-core&lt;/artifactId&gt; &lt;version&gt;2.1.0&lt;/version&gt;&lt;/dependency&gt;When writing code that uses Accumulo, only use the Accumulo Public API.The accumulo-core artifact includes implemen [...]
 "url": " /docs/2.x/getting-started/clients",
 "categories": "getting-started"
 },
 "docs-2-x-getting-started-design": {
 "title": "Design",
- "content": "BackgroundThe design of Apache Accumulo is inspired by Google’s BigTable paper.Data ModelAccumulo provides a richer data model than simple key-value stores, but is not afully relational database. Data is represented as key-value pairs, where the key andvalue are comprised of the following elements:All elements of the Key and the Value are represented as byte arrays except forTimestamp, which is a Long. Accumulo sorts keys by element and lexicographicallyin ascending ord [...]
+ "content": "BackgroundThe design of Apache Accumulo is inspired by Google’s BigTable paper.Data ModelAccumulo provides a richer data model than simple key-value stores, but is not afully relational database. Data is represented as key-value pairs, where the key andvalue are comprised of the following elements:All elements of the Key and the Value are represented as byte arrays except forTimestamp, which is a Long. Accumulo sorts keys by element and lexicographicallyin ascending ord [...]
 "url": " /docs/2.x/getting-started/design",
 "categories": "getting-started"
 },
 "docs-2-x-getting-started-features": {
 "title": "Features",
- "content": " Table Design and Configuration Integrity/Availability Performance Testing Client API Plugins General Administration Internal Data Management On-demand Data ManagementTable Design and ConfigurationIteratorsIterators are server-side programming mechanisms that encode functions such as filtering andaggregation within the data management steps (scopes where data is read from orwritten to disk) that happen in the tablet server.Security labelsAccumulo Keys can conta [...]
+ "content": " Table Design and Configuration Integrity/Availability Performance Testing Client API Plugins General Administration Internal Data Management On-demand Data ManagementTable Design and ConfigurationIteratorsIterators are server-side programming mechanisms that encode functions such as filtering andaggregation within the data management steps (scopes where data is read from orwritten to disk) that happen in the tablet server.Security labelsAccumulo Keys can conta [...]
 "url": " /docs/2.x/getting-started/features",
 "categories": "getting-started"
 },
 "docs-2-x-getting-started-glossary": {
 "title": "Glossary",
- "content": " authorizations a set of strings associated with a user or with a particular scan that willbe used to determine which key/value pairs are visible to the user. cell a set of key/value pairs whose keys differ only in timestamp. column the portion of the key that sorts after the row and is divided into family,qualifier, and visibility. column family the portion of the key that sorts second and controls local [...]
+ "content": " authorizations a set of strings associated with a user or with a particular scan that willbe used to determine which key/value pairs are visible to the user. cell a set of key/value pairs whose keys differ only in timestamp. column the portion of the key that sorts after the row and is divided into family,qualifier, and visibility. column family the portion of the key that sorts second and controls local [...]
 "url": " /docs/2.x/getting-started/glossary",
 "categories": "getting-started"
 },
 "docs-2-x-getting-started-quickstart": {
 "title": "Setup",
- "content": "User Manual (2.x)Starting with Accumulo 2.0, the user manual now lives on the website as a seriesof web pages. Previously, it was one large pdf document that was only generatedduring a release. The user manual can now be updated very quickly and indexedfor searching across many webpages.The manual can now be searched using the Search link at the top of thewebsite or navigated by clicking the links to the left. If you are new toAccumulo, follow the instructions below to [...]
+ "content": "User Manual (2.x)Starting with Accumulo 2.0, the user manual now lives on the website as a seriesof web pages. Previously, it was one large pdf document that was only generatedduring a release. The user manual can now be updated very quickly and indexedfor searching across many webpages.The manual can now be searched using the Search link at the top of thewebsite or navigated by clicking the links to the left. If you are new toAccumulo, follow the instructions below to [...]
 "url": " /docs/2.x/getting-started/quickstart",
 "categories": "getting-started"
 },
@@ -198,7 +198,7 @@
 "docs-2-x-getting-started-table-configuration": {
 "title": "Table Configuration",
- "content": "Accumulo tables have a few options that can be configured to alter the defaultbehavior of Accumulo as well as improve performance based on the data stored.These include locality groups, constraints, bloom filters, iterators, and blockcache. See the server properties documentation for a complete list of availableconfiguration options.Locality GroupsAccumulo supports storing sets of column families separately on disk to allowclients to efficiently scan over columns that [...]
+ "content": "Accumulo tables have a few options that can be configured to alter the defaultbehavior of Accumulo as well as improve performance based on the data stored.These include locality groups, constraints, bloom filters, iterators, and blockcache. See the server properties documentation for a complete list of availableconfiguration options.Locality GroupsAccumulo supports storing sets of column families separately on disk to allowclients to efficiently scan over columns that [...]
 "url": " /docs/2.x/getting-started/table_configuration",
 "categories": "getting-started"
 },
@@ -226,21 +226,21 @@
 "docs-2-x-security-authorizations": {
 "title": "Authorizations",
- "content": "In Accumulo, data is written with security labels that limit access to only users with the properauthorizations.ConfigurationAccumulo’s Authorizor is configured by setting instance.security.authorizor. The defaultauthorizor is the ZKAuthorizor which is describedbelow.Security LabelsEvery Key-Value pair in Accumulo has its own security label, stored under the column visibilityelement of the key, which is used to determine whether a given user meets the securityrequiremen [...]
+ "content": "In Accumulo, data is written with security labels that limit access to only users with the properauthorizations.ConfigurationAccumulo’s Authorizor is configured by setting instance.security.authorizor. The defaultauthorizor is the ZKAuthorizor which is describedbelow.Security LabelsEvery Key-Value pair in Accumulo has its own security label, stored under the column visibilityelement of the key, which is used to determine whether a given user meets the securityrequiremen [...]
 "url": " /docs/2.x/security/authorizations",
 "categories": "security"
 },
 "docs-2-x-security-kerberos": {
 "title": "Kerberos",
- "content": "OverviewKerberos is a network authentication protocol that provides a secure way forpeers to prove their identity over an unsecure network in a client-server model.A centralized key-distribution center (KDC) is the service that coordinatesauthentication between a client and a server. Clients and servers use “tickets”,obtained from the KDC via a password or a special file called a “keytab”, tocommunicate with the KDC and prove their identity. A KDC administrator mustcrea [...]
+ "content": "OverviewKerberos is a network authentication protocol that provides a secure way forpeers to prove their identity over an unsecure network in a client-server model.A centralized key-distribution center (KDC) is the service that coordinatesauthentication between a client and a server. Clients and servers use “tickets”,obtained from the KDC via a password or a special file called a “keytab”, tocommunicate with the KDC and prove their identity. A KDC administrator mustcrea [...]
 "url": " /docs/2.x/security/kerberos",
 "categories": "security"
 },
 "docs-2-x-security-on-disk-encryption": {
 "title": "On Disk Encryption",
- "content": "For an additional layer of security, Accumulo can encrypt files stored on-disk. On Disk encryption was reworkedfor 2.0, making it easier to configure and more secure. Starting with 2.1, On Disk Encryption can now be configuredper table as well as for the entire instance (all tables). The files that can be encrypted include: RFiles and Write AheadLogs (WALs). NOTE: This feature is considered experimental and upgrading a previously encrypted instanceis not supported. Fo [...]
+ "content": "For an additional layer of security, Accumulo can encrypt files stored on-disk. On Disk encryption was reworkedfor 2.0, making it easier to configure and more secure. Starting with 2.1, On Disk Encryption can now be configuredper table as well as for the entire instance (all tables). The files that can be encrypted include: RFiles and Write AheadLogs (WALs). NOTE: This feature is considered experimental and upgrading a previously encrypted instanceis not supported. Fo [...]
 "url": " /docs/2.x/security/on-disk-encryption",
 "categories": "security"
 },
@@ -261,21 +261,21 @@
 "docs-2-x-security-wire-encryption": {
 "title": "Wire Encryption",
- "content": "Accumulo, through Thrift’s TSSLTransport, provides the ability to encryptwire communication between Accumulo servers and clients using securesockets layer (SSL). SSL certificates signed by the same certificate authoritycontrol the “circle of trust” in which a secure connection can be established.Typically, each host running Accumulo processes would be given a certificatewhich identifies itself.Clients can optionally also be given a certificate, when client-auth is enabl [...]
+ "content": "Accumulo, through Thrift’s TSSLTransport, provides the ability to encryptwire communication between Accumulo servers and clients using securesockets layer (SSL). SSL certificates signed by the same certificate authoritycontrol the “circle of trust” in which a secure connection can be established.Typically, each host running Accumulo processes would be given a certificatewhich identifies itself.Clients can optionally also be given a certificate, when client-auth is enabl [...]
 "url": " /docs/2.x/security/wire-encryption",
 "categories": "security"
 },
 "docs-2-x-troubleshooting-advanced": {
 "title": "Advanced Troubleshooting",
- "content": "Tablet server locksMy tablet server lost its lock. Why?The primary reason a tablet server loses its lock is that it has been pushed into swap.A large java program (like the tablet server) may have a large portionof its memory image unused. The operation system will favor pushingthis allocated, but unused memory into swap so that the memory can bere-used as a disk buffer. When the java virtual machine decides toaccess this memory, the OS will begin flushing disk buffe [...]
+ "content": "Tablet server locksMy tablet server lost its lock. Why?The primary reason a tablet server loses its lock is that it has been pushed into swap.A large java program (like the tablet server) may have a large portionof its memory image unused. The operating system will favor pushingthis allocated, but unused memory into swap so that the memory can bere-used as a disk buffer. When the java virtual machine decides toaccess this memory, the OS will begin flushing disk buffe [...]
 "url": " /docs/2.x/troubleshooting/advanced",
 "categories": "troubleshooting"
 },
 "docs-2-x-troubleshooting-basic": {
 "title": "Basic Troubleshooting",
- "content": "GeneralThe tablet server does not seem to be running!? What happened?Accumulo is a distributed system. It is supposed to run on remoteequipment, across hundreds of computers. Each program that runs onthese remote computers writes down events as they occur, into a localfile. By default, this is defined in conf/accumulo-env.sh as ACCUMULO_LOG_DIR.Look in the $ACCUMULO_LOG_DIR/tserver*.log file. Specifically, check the end of the file.The tablet server did not start and [...]
+ "content": "GeneralThe tablet server does not seem to be running!? What happened?Accumulo is a distributed system. It is supposed to run on remoteequipment, across hundreds of computers. Each program that runs onthese remote computers writes down events as they occur, into a localfile. By default, this is defined in conf/accumulo-env.sh as ACCUMULO_LOG_DIR.Look in the $ACCUMULO_LOG_DIR/tserver*.log file. Specifically, check the end of the file.The tablet server did not start and [...]
 "url": " /docs/2.x/troubleshooting/basic",
 "categories": "troubleshooting"
 },
@@ -289,14 +289,14 @@
 "docs-2-x-troubleshooting-system-metadata-tables": {
 "title": "System Metadata Tables",
- "content": "Accumulo tracks information about tables in metadata tables. The metadata formost tables is contained within the metadata table in the accumulo namespace,while metadata for that table is contained in the root table in the accumulonamespace. The root table is composed of a single tablet, which does notsplit, so it is also called the root tablet. Information about the roottable, such as its location and write-ahead logs, are stored in ZooKeeper.Let’s create a table and pu [...]
+ "content": "Accumulo tracks information about tables in metadata tables. The metadata formost tables is contained within the metadata table in the accumulo namespace,while metadata for that table is contained in the root table in the accumulonamespace. The root table is composed of a single tablet, which does notsplit, so it is also called the root tablet. Information about the roottable, such as its location and write-ahead logs, are stored in ZooKeeper.Let’s create a table and pu [...]
 "url": " /docs/2.x/troubleshooting/system-metadata-tables",
 "categories": "troubleshooting"
 },
 "docs-2-x-troubleshooting-tools": {
 "title": "Troubleshooting Tools",
- "content": "The accumulo command can be used to run various tools and classes from the command line.RFileInfoThe rfile-info tool will examine an Accumulo storage file and print out basic metadata.$ accumulo rfile-info /accumulo/tables/1/default_tablet/A000000n.rf2013-07-16 08:17:14,778 [util.NativeCodeLoader] INFO : Loaded the native-hadoop libraryLocality group : &lt;DEFAULT&gt; Start block : 0 Num blocks : 1 Index level 0 [...]
+ "content": "The accumulo command can be used to run various tools and classes from the command line.RFileInfoThe rfile-info tool will examine an Accumulo storage file and print out basic metadata.$ accumulo rfile-info /accumulo/tables/1/default_tablet/A000000n.rf2013-07-16 08:17:14,778 [util.NativeCodeLoader] INFO : Loaded the native-hadoop libraryLocality group : &lt;DEFAULT&gt; Start block : 0 Num blocks : 1 Index level 0 [...]
 "url": " /docs/2.x/troubleshooting/tools",
 "categories": "troubleshooting"
 },