http://git-wip-us.apache.org/repos/asf/accumulo-website/blob/32ac61d3/docs/master/kerberos.md ---------------------------------------------------------------------- diff --git a/docs/master/kerberos.md b/docs/master/kerberos.md new file mode 100644 index 0000000..743285a --- /dev/null +++ b/docs/master/kerberos.md @@ -0,0 +1,663 @@

// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements.  See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License.  You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

== Kerberos

=== Overview

Kerberos is a network authentication protocol that provides a secure way for
peers to prove their identity over an insecure network in a client-server model.
A centralized key distribution center (KDC) is the service that coordinates
authentication between a client and a server. Clients and servers use "tickets",
obtained from the KDC via a password or a special file called a "keytab", to
communicate with the KDC and prove their identity. A KDC administrator must
create the principal (the name for the client/server identity) and the password
or keytab, securely passing the necessary information to the actual user/service.
Properly securing the KDC and the generated ticket material is central to the
security model; this is mentioned here only as a warning to administrators
running their own KDC.
To interact with Kerberos programmatically, GSSAPI and SASL are two standards
which allow cross-language integration with Kerberos for authentication. GSSAPI,
the Generic Security Service Application Program Interface, is a standard which
Kerberos implements. The Java programming language also implements GSSAPI,
which is leveraged by other applications like Apache Hadoop and Apache Thrift.
SASL, the Simple Authentication and Security Layer, is a framework for
authentication and security over the network. SASL provides a number of
mechanisms for authentication, one of which is GSSAPI. Thus, SASL provides the
transport which authenticates using the GSSAPI that Kerberos implements.

Kerberos is a very complicated piece of software and deserves much more
description than can be provided here. An http://www.roguelynn.com/words/explain-like-im-5-kerberos/[explain like
I'm 5] blog post is very good at distilling the basics, while http://web.mit.edu/kerberos/[MIT Kerberos's project page]
contains lots of documentation for users and administrators. Various Hadoop "vendors"
also provide free documentation that includes step-by-step instructions for
configuring Hadoop and ZooKeeper (which will henceforth be considered prerequisites).

=== Within Hadoop

Out of the box, HDFS and YARN have no ability to enforce that a user is who
they claim to be. Thus, any basic Hadoop installation should be treated as
insecure: any user with access to the cluster has the ability to access any data.
Using Kerberos to provide authentication, users can be strongly identified,
delegating to Kerberos the task of verifying that a user is who they claim to be.
As such, Kerberos is widely used across the entire Hadoop ecosystem for strong
authentication.
Since server processes accessing HDFS or YARN are required
to use Kerberos to authenticate with HDFS, it makes sense that they also require
Kerberos authentication from their clients, in addition to the other features
provided by SASL.

A typical deployment involves the creation of Kerberos principals for all server
processes (Hadoop datanodes and namenode(s), ZooKeepers), the creation of a keytab
file for each principal, and then proper configuration of the Hadoop site xml files.
Users also need Kerberos principals created for them; however, a user typically
uses a password to identify themselves instead of a keytab. Users can obtain a
ticket granting ticket (TGT) from the KDC using their password, which allows them
to authenticate for the lifetime of the TGT (typically one day by default) and alleviates
the need for further password authentication.

For long-running server applications, like web servers, a keytab can be created which
allows for fully-automated Kerberos identification, removing the need to enter any
password, at the cost of needing to protect the keytab file. These principals
will apply directly to authentication for clients accessing Accumulo and to the
Accumulo processes accessing HDFS.

=== Delegation Tokens

MapReduce, a common way that clients interact with Accumulo, does not map well to the
client-server model that Kerberos was originally designed to support. Specifically, the parallelization
of tasks across many nodes introduces the problem of securely sharing the user credentials across
these tasks in as safe a manner as possible. To address this problem, Hadoop introduced the notion
of a delegation token to be used in distributed execution settings.

A delegation token is nothing more than a short-term, on-the-fly password generated after authenticating with the user's
credentials. In Hadoop itself, the NameNode and ResourceManager, for HDFS and YARN respectively, act as the gateway for
delegation token requests.
For example, before a YARN job is submitted, the implementation will request delegation
tokens from the NameNode and ResourceManager so the YARN tasks can communicate with HDFS and YARN. In the same manner,
support has been added in the Accumulo Master to generate delegation tokens so that, when Kerberos
authentication is enabled, clients can interact with Accumulo via MapReduce just as they do
with HDFS and YARN.

Generating an expiring password is, arguably, more secure than distributing the user's
credentials across the cluster: if a delegation token is compromised, only access to
HDFS, YARN and Accumulo is exposed, as opposed to the user's entire
Kerberos credential. Additional details for clients and servers will be covered
in subsequent sections.

=== Configuring Accumulo

To configure Accumulo for use with Kerberos, both client-facing and server-facing
changes must be made for a functional system on secured Hadoop. As previously mentioned,
numerous guides already exist on the subject of configuring Hadoop and ZooKeeper for
use with Kerberos, so that topic won't be covered here. It is assumed that you already have
functional Hadoop and ZooKeeper installed.

Note that on an existing cluster the server-side changes will require a full cluster shutdown and restart. You should
wait to restart the TraceServers until after you've completed the rest of the cluster setup and provisioned
a trace user with appropriate permissions.

==== Servers

The first step is to obtain a Kerberos identity for the Accumulo server processes.
When running Accumulo with Kerberos enabled, a valid Kerberos identity is required
to initiate any RPC between Accumulo processes (e.g. Master and TabletServer), in addition
to any HDFS action (e.g. client to HDFS or TabletServer to HDFS).
===== Generate Principal and Keytab

In the +kadmin.local+ shell or using the +-q+ option on +kadmin.local+, create a
principal for Accumulo on every host that runs an Accumulo process. A Kerberos
principal is of the form "primary/instance@REALM". "accumulo" is commonly the "primary"
(although this is not required) and the "instance" is the fully-qualified domain name of
the host that will be running the Accumulo process -- this is required.

----
kadmin.local -q "addprinc -randkey accumulo/host.domain.com"
----

Perform the above for each node running Accumulo processes in the instance, modifying
"host.domain.com" for your network. The +randkey+ option generates a random password
because we will use a keytab for authentication, not a password, since the Accumulo
server processes don't have an interactive console to enter a password into.

----
kadmin.local -q "xst -k accumulo.hostname.keytab accumulo/host.domain.com"
----

To simplify deployments, at the cost of security, all Accumulo principals could
be globbed into a single keytab:

----
kadmin.local -q "xst -k accumulo.service.keytab -glob accumulo*"
----

To ensure that the SASL handshake can occur from clients to servers and from servers to servers,
all Accumulo servers must share the same primary and realm principal components, as the
"client" needs to know these to set up the connection with the "server".

===== Server Configuration

A number of properties need to be changed in +accumulo-site.xml+ to properly
configure the servers.

[options="header"]
|=================================================================
|Key | Default Value | Description
|general.kerberos.keytab |/etc/security/keytabs/accumulo.service.keytab |
The path to the keytab for Accumulo on the local filesystem. Change the value to the actual path on your system.

|general.kerberos.principal |accumulo/_HOST@REALM |
The Kerberos principal for Accumulo; needs to match the keytab.
"_HOST" can be used instead of the actual hostname in the principal and will be automatically expanded to the current FQDN, which reduces the configuration file burden.

|instance.rpc.sasl.enabled |true |
Enables SASL for the Thrift servers (supports GSSAPI).

|rpc.sasl.qop |auth |
One of "auth", "auth-int", or "auth-conf". These map to the SASL-defined properties for
quality of protection. "auth" is authentication only; "auth-int" is authentication and data
integrity; "auth-conf" is authentication, data integrity and confidentiality.

|instance.security.authenticator |org.apache.accumulo.server.security.handler.KerberosAuthenticator |
Configures Accumulo to use the Kerberos principal as the Accumulo username/principal.

|instance.security.authorizor |org.apache.accumulo.server.security.handler.KerberosAuthorizor |
Configures Accumulo to use the Kerberos principal for authorization purposes.

|instance.security.permissionHandler |org.apache.accumulo.server.security.handler.KerberosPermissionHandler |
Configures Accumulo to use the Kerberos principal for permission purposes.

|trace.token.type |org.apache.accumulo.core.client.security.tokens.KerberosToken |
Configures the Accumulo Tracer to use the KerberosToken for authentication when serializing traces to the trace table.

|trace.user |accumulo/_HOST@REALM |
The tracer process needs valid credentials to serialize traces to Accumulo. While the other server processes are
creating a SystemToken from the provided keytab and principal, we can still use a normal KerberosToken and the same
keytab/principal to serialize traces. As with non-Kerberized instances, the table must be created and permissions granted
to the trace.user. The same +_HOST+ replacement is performed on this value, substituting the FQDN for +_HOST+.

|trace.token.property.keytab ||
You can optionally specify the path to a keytab file for the principal given in the +trace.user+ property.
If you don't set this path, it will default to the value given in +general.kerberos.keytab+.

|general.delegation.token.lifetime |7d |
The length of time that the server-side secret used to create delegation tokens is valid. After a server-side secret
expires, a delegation token created with that secret is no longer valid.

|general.delegation.token.update.interval |1d |
The frequency with which new server-side secrets should be generated to create delegation tokens for clients. Generating
new secrets reduces the likelihood of cryptographic attacks.

|=================================================================

Although it should be a prerequisite, it is very important that you have DNS properly
configured for your nodes and that Accumulo is configured to use the FQDN. It
is extremely important to use the FQDN in each of the "hosts" files for each
Accumulo process: +masters+, +monitors+, +tservers+, +tracers+, and +gc+.

Normally, no changes are needed in +accumulo-env.sh+ to enable Kerberos. Typically, +krb5.conf+
is installed on the local machine in +/etc/+, and the Java library implementations will look
there to find the necessary configuration to communicate with the KDC. Some installations
may require a different +krb5.conf+ to be used for Accumulo, which can be accomplished
by adding the JVM system property +-Djava.security.krb5.conf=/path/to/other/krb5.conf+ to
+JAVA_OPTS+ in +accumulo-env.sh+.

===== KerberosAuthenticator

The +KerberosAuthenticator+ is an implementation of the pluggable security interfaces
that Accumulo provides. It builds on top of the default ZooKeeper-based implementation,
but removes the need to create user accounts with passwords in Accumulo for clients. As
long as a client has a valid Kerberos identity, they can connect to and interact with
Accumulo, although initially without any permissions (e.g. they cannot create tables or write data).
Leveraging
ZooKeeper removes the need to change the permission handler and authorizor, so other Accumulo
functions regarding permissions and cell-level authorizations do not change.

It is extremely important to note that, while user operations like +SecurityOperations.listLocalUsers()+,
+SecurityOperations.dropLocalUser()+, and +SecurityOperations.createLocalUser()+ will not return
errors, these methods are not equivalent to those in normal installations, as they will only operate on
users which have, at one point in time, authenticated with Accumulo using their Kerberos identity.
The KDC is still the authoritative entity for user management. The previously mentioned methods
are provided because they simplify management of users within Accumulo, especially with respect
to granting Authorizations and Permissions to new users.

===== Administrative User

Out of the box (without Kerberos enabled), Accumulo has a single user with administrative permissions: "root".
This user is used to "bootstrap" other users, creating less-privileged users for applications using
the system. With Kerberos, to authenticate with the system, the client is required to present Kerberos
credentials for the principal (user) the client is trying to authenticate as.

Because of this, an administrative user named "root" would be useless in an instance using Kerberos,
because it is very unlikely that Kerberos credentials exist for a principal named `root`. When Kerberos is
enabled, Accumulo will prompt for the name of a user to grant the same permissions that the `root`
user would normally have. The name of the Accumulo user to grant administrative permissions to can
also be given by the `-u` or `--user` options.

If you are enabling Kerberos on an existing cluster, you will need to reinitialize the security system in
order to replace the existing "root" user with one that can be used with Kerberos.
These steps should be
completed after you have made the previously described configuration changes, and they require access to
a complete +accumulo-site.xml+, including the instance secret. Note that this process will delete all
existing users in the system; you will need to reassign user permissions based on Kerberos principals.

1. Ensure Accumulo is not running.
2. Given the path to an +accumulo-site.xml+ with the instance secret, run the security reset tool. If you are
prompted for a password you can just hit return, since it won't be used.
3. Start the Accumulo cluster.

----
$ accumulo-cluster stop
...
$ accumulo init --reset-security
Running against secured HDFS
Principal (user) to grant administrative privileges to : accumulo_ad...@example.com
Enter initial password for accumulo_ad...@example.com (this may not be applicable for your security setup):
Confirm initial password for accumulo_ad...@example.com:
$ accumulo-cluster start
...
$
----

===== Verifying secure access

To verify that servers have correctly started with Kerberos enabled, ensure that the processes
are actually running (they should exit immediately if login fails) and verify that you see
something similar to the following in the application log.

----
2015-01-07 11:57:56,826 [security.SecurityUtil] INFO : Attempting to login with keytab as accumulo/hostn...@example.com
2015-01-07 11:57:56,830 [security.UserGroupInformation] INFO : Login successful for user accumulo/hostn...@example.com using keytab file /etc/security/keytabs/accumulo.service.keytab
----

===== Impersonation

Impersonation is functionality which allows one user to act as another. One direct application
of this concept within Accumulo is the Thrift proxy. The Thrift proxy is configured to accept
user requests and pass them on to Accumulo, enabling client access to Accumulo via any thrift-compatible
language.
When the proxy runs with SASL transports, clients must present a valid
Kerberos identity to make a connection. In this situation, the Thrift proxy server does not have
access to the secret key material needed to make a secure connection to Accumulo as the client;
it can only connect to Accumulo as itself. Impersonation, in this context, refers to the ability
of the proxy to authenticate to Accumulo as itself, but act on behalf of an Accumulo user.

Accumulo supports basic impersonation of end-users by a third party via static rules in Accumulo's
site configuration file. These two semicolon-separated properties are aligned
by index: the first element in the user impersonation property value matches the first element
in the host impersonation property value, and so on.

----
<property>
  <name>instance.rpc.sasl.allowed.user.impersonation</name>
  <value>$PROXY_USER:*</value>
</property>

<property>
  <name>instance.rpc.sasl.allowed.host.impersonation</name>
  <value>*</value>
</property>
----

Here, +$PROXY_USER+ can impersonate any user from any host.

The following is an example of specifying a subset of users +$PROXY_USER+ can impersonate and also
limiting the hosts from which +$PROXY_USER+ can initiate requests.

----
<property>
  <name>instance.rpc.sasl.allowed.user.impersonation</name>
  <value>$PROXY_USER:user1,user2;$PROXY_USER2:user2,user4</value>
</property>

<property>
  <name>instance.rpc.sasl.allowed.host.impersonation</name>
  <value>host1.domain.com,host2.domain.com;*</value>
</property>
----

Here, +$PROXY_USER+ can impersonate user1 and user2 only from host1.domain.com or host2.domain.com.
+$PROXY_USER2+ can impersonate user2 and user4 from any host.

In these examples, the value +$PROXY_USER+ is the Kerberos principal of the server which is acting on behalf of a user.
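To make the index pairing between the two properties concrete, the following is a small,
hypothetical helper (this is not Accumulo's internal parser; the class name and structure are
invented purely for illustration) that lines up each entry of the user impersonation
property with the corresponding entry of the host impersonation property:

[source,java]
----
import java.util.LinkedHashMap;
import java.util.Map;

public class ImpersonationRules {

  /**
   * Pairs the semicolon-separated entries of the two properties by index.
   * Returns a map of proxy principal to {allowed users, allowed hosts}.
   */
  static Map<String, String[]> parse(String userProp, String hostProp) {
    String[] userEntries = userProp.split(";");
    String[] hostEntries = hostProp.split(";");
    // The two properties are aligned by index, so they must have the same length.
    if (userEntries.length != hostEntries.length) {
      throw new IllegalArgumentException("user and host properties must align by index");
    }
    Map<String, String[]> rules = new LinkedHashMap<>();
    for (int i = 0; i < userEntries.length; i++) {
      // Each user entry is "proxyPrincipal:comma,separated,users"
      String[] parts = userEntries[i].split(":", 2);
      if (parts.length != 2) {
        throw new IllegalArgumentException("malformed user entry: " + userEntries[i]);
      }
      rules.put(parts[0], new String[] {parts[1], hostEntries[i]});
    }
    return rules;
  }

  public static void main(String[] args) {
    Map<String, String[]> rules = parse(
        "$PROXY_USER:user1,user2;$PROXY_USER2:user2,user4",
        "host1.domain.com,host2.domain.com;*");
    for (Map.Entry<String, String[]> e : rules.entrySet()) {
      System.out.println(e.getKey() + " may impersonate " + e.getValue()[0]
          + " from hosts " + e.getValue()[1]);
    }
  }
}
----

Running this against the example values above prints one rule per proxy principal, making
explicit which users and hosts each proxy is allowed to impersonate from.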
Impersonation is enforced by the Kerberos principal and the host from which the RPC originated (from the perspective
of the Accumulo TabletServers/Masters). An asterisk (*) can be used to specify all users or all hosts (depending on the context).

===== Delegation Tokens

Within the Accumulo services, the primary task in implementing delegation tokens is the generation and distribution
of a shared secret among all Accumulo tabletservers and the master. The secret key allows for the generation
of delegation tokens for users and the verification of delegation tokens presented by clients. If a server
process is unaware of the secret key used to create a delegation token, the client cannot be authenticated.
As distribution via ZooKeeper is an asynchronous operation (typically on the order of seconds), the
value for `general.delegation.token.update.interval` should be on the order of hours to days to reduce the
likelihood of servers rejecting valid clients because the server has not yet seen a new secret key.

To support authentication with both Kerberos credentials and delegation tokens, the SASL thrift
server accepts connections with either the `GSSAPI` or `DIGEST-MD5` mechanism. The `DIGEST-MD5` mechanism
enables authentication as a normal username and password exchange, which `DelegationToken`s leverage.

Since delegation tokens are a weaker form of authentication than Kerberos credentials, user access
to obtain delegation tokens from Accumulo is protected with the `DELEGATION_TOKEN` system permission. Only
users with this system permission are allowed to obtain delegation tokens. It is also recommended
to configure confidentiality with SASL, using the `rpc.sasl.qop=auth-conf` configuration property, to
ensure that prying eyes cannot view the `DelegationToken` as it passes over the network.
----
# Check a user's permissions
admin@REALM@accumulo> userpermissions -u user@REALM

# Grant the DELEGATION_TOKEN system permission to a user
admin@REALM@accumulo> grant System.DELEGATION_TOKEN -s -u user@REALM
----

==== Clients

===== Create client principal

Like the Accumulo servers, clients must also have a Kerberos principal created for them. The
primary difference from a server principal is that a user's principal is created
with a password and is not qualified to a specific instance (host).

----
kadmin.local -q "addprinc $user"
----

The above will prompt for a password for the user, which will be used to identify that $user.
The user can verify that they can authenticate with the KDC using the command `kinit $user`.
Upon entering the correct password, a local credentials cache will be created which can be used
to authenticate with Accumulo, access HDFS, etc.

The user can verify the state of their local credentials cache by using the command `klist`.

----
$ klist
Ticket cache: FILE:/tmp/krb5cc_123
Default principal: u...@example.com

Valid starting       Expires              Service principal
01/07/2015 11:56:35  01/08/2015 11:56:35  krbtgt/example....@example.com
	renew until 01/14/2015 11:56:35
----

===== Configuration

The second thing clients need to do is set up their client configuration file. By
default, this file is stored in +~/.accumulo/config+ or +/path/to/accumulo/client.conf+.
Accumulo utilities also allow you to provide your own copy of this file in any location
using the +--config-file+ command line option.

Three items need to be set to enable access to Accumulo:

* +instance.rpc.sasl.enabled+=_true_
* +rpc.sasl.qop+=_auth_
* +kerberos.server.primary+=_accumulo_

Each of these properties *must* match the configuration of the Accumulo servers; this is
required to set up the SASL transport.
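As a reference, a minimal sketch of such a client configuration file, assuming the default
values shown above, could look like the following (the exact values must mirror your
servers' +accumulo-site.xml+; in particular, +kerberos.server.primary+ must match the
"primary" component chosen for the server principals):

----
instance.rpc.sasl.enabled=true
rpc.sasl.qop=auth
kerberos.server.primary=accumulo
----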
===== Verifying Administrative Access

At this point you should have enough configured on the server and client side to interact with
the system. You should verify that the administrative user you chose earlier can successfully
interact with the system.

While this example logs in via +kinit+ with a password, any login method that caches Kerberos tickets
should work.

----
$ kinit accumulo_ad...@example.com
Password for accumulo_ad...@example.com: ******************************
$ accumulo shell

Shell - Apache Accumulo Interactive Shell
-
- version: 1.7.2
- instance name: MYACCUMULO
- instance id: 483b9038-889f-4b2d-b72b-dfa2bb5dbd07
-
- type 'help' for a list of available commands
-
accumulo_ad...@example.com@MYACCUMULO> userpermissions
System permissions: System.GRANT, System.CREATE_TABLE, System.DROP_TABLE, System.ALTER_TABLE, System.CREATE_USER, System.DROP_USER, System.ALTER_USER, System.SYSTEM, System.CREATE_NAMESPACE, System.DROP_NAMESPACE, System.ALTER_NAMESPACE, System.OBTAIN_DELEGATION_TOKEN

Namespace permissions (accumulo): Namespace.READ, Namespace.ALTER_TABLE

Table permissions (accumulo.metadata): Table.READ, Table.ALTER_TABLE
Table permissions (accumulo.replication): Table.READ
Table permissions (accumulo.root): Table.READ, Table.ALTER_TABLE

accumulo_ad...@example.com@MYACCUMULO> quit
$ kdestroy
$
----

===== DelegationTokens with MapReduce

To use DelegationTokens in a custom MapReduce job, the call to the `setConnectorInfo()` method
on `AccumuloInputFormat` or `AccumuloOutputFormat` should be the only necessary change. Instead
of providing an instance of a `KerberosToken`, the user must call `SecurityOperations.getDelegationToken`
using a `Connector` obtained with that `KerberosToken`, and pass the `DelegationToken` to
`setConnectorInfo` instead of the `KerberosToken`.
It is expected that the user launching
the MapReduce job is already logged in to Kerberos via a keytab or a locally cached
Kerberos ticket-granting ticket (TGT).

[source,java]
----
Instance instance = getInstance();
KerberosToken kt = new KerberosToken();
Connector conn = instance.getConnector(principal, kt);
DelegationToken dt = conn.securityOperations().getDelegationToken();

// Reading from Accumulo
AccumuloInputFormat.setConnectorInfo(job, principal, dt);

// Writing to Accumulo
AccumuloOutputFormat.setConnectorInfo(job, principal, dt);
----

If the user passes a `KerberosToken` to the `setConnectorInfo` method, the implementation will
attempt to obtain a `DelegationToken` automatically, but this has limitations
based on the other MapReduce configuration methods already called and the permissions granted
to the calling user. It is best for the user to acquire the `DelegationToken` on their own
and provide it directly to `setConnectorInfo`.

Users must have the `DELEGATION_TOKEN` system permission to call the `getDelegationToken`
method. The obtained delegation token is only valid for the requesting user for a period
of time dependent on Accumulo's configuration (`general.delegation.token.lifetime`).

It is also possible to obtain and use `DelegationToken`s outside of the context
of MapReduce.

[source,java]
----
String principal = "user@REALM";
Instance instance = getInstance();
Connector connector = instance.getConnector(principal, new KerberosToken());
DelegationToken delegationToken = connector.securityOperations().getDelegationToken();

Connector dtConnector = instance.getConnector(principal, delegationToken);
----

Use of the `dtConnector` will perform each operation as the original user, but without
their Kerberos credentials.
For the duration of validity of the `DelegationToken`, the user *must* take the necessary precautions
to protect the `DelegationToken` from prying eyes, as it can be used by any user on any host to impersonate
the user who requested the `DelegationToken`. YARN ensures that passing the delegation token from the client
JVM to each YARN task is secure, even in multi-tenant instances.

==== Debugging

*Q*: I have valid Kerberos credentials and a correct client configuration file but
I still get errors like:

----
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
----

*A*: When you have a valid client configuration and Kerberos TGT, it is possible that the search
path for your local credentials cache is incorrect. Check the value of the KRB5CCNAME environment
variable, and ensure it matches the value reported by `klist`.

----
$ echo $KRB5CCNAME

$ klist
Ticket cache: FILE:/tmp/krb5cc_123
Default principal: u...@example.com

Valid starting       Expires              Service principal
01/07/2015 11:56:35  01/08/2015 11:56:35  krbtgt/example....@example.com
	renew until 01/14/2015 11:56:35
$ export KRB5CCNAME=/tmp/krb5cc_123
$ echo $KRB5CCNAME
/tmp/krb5cc_123
----

*Q*: I thought I had everything configured correctly, but my client/server still fails to log in.
I don't know what is actually failing.

*A*: Add the following system property to the JVM invocation:

----
-Dsun.security.krb5.debug=true
----

This will enable lots of extra debugging at the JVM level, which is often sufficient to
diagnose some high-level configuration problem. Client applications can add this system property by
hand on the command line; Accumulo server processes, or applications started using the `accumulo`
script, can add it to +JAVA_OPTS+ in +accumulo-env.sh+.
Additionally, you can increase the log4j levels on +org.apache.hadoop.security+, which includes the
Hadoop +UserGroupInformation+ class; this will emit some high-level debug statements. This
can be controlled in your client application, or via +log4j-service.properties+.

*Q*: All of my Accumulo processes successfully start and log in with their
keytab, but they are unable to communicate with each other, showing the
following errors:

----
2015-01-12 14:47:27,055 [transport.TSaslTransport] ERROR: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
        at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
        at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
        at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:53)
        at org.apache.accumulo.core.rpc.UGIAssumingTransport$1.run(UGIAssumingTransport.java:49)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.accumulo.core.rpc.UGIAssumingTransport.open(UGIAssumingTransport.java:49)
        at org.apache.accumulo.core.rpc.ThriftUtil.createClientTransport(ThriftUtil.java:357)
        at org.apache.accumulo.core.rpc.ThriftUtil.createTransport(ThriftUtil.java:255)
        at org.apache.accumulo.server.master.LiveTServerSet$TServerConnection.getTableMap(LiveTServerSet.java:106)
        at org.apache.accumulo.master.Master.gatherTableInformation(Master.java:996)
        at
org.apache.accumulo.master.Master.access$600(Master.java:160)
        at org.apache.accumulo.master.Master$StatusThread.updateStatus(Master.java:911)
        at org.apache.accumulo.master.Master$StatusThread.run(Master.java:901)
Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)
        at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:710)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
        ... 16 more
Caused by: KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
        at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
        at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:192)
        at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:203)
        at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:309)
        at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:115)
        at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:454)
        at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:641)
        ... 19 more
Caused by: KrbException: Identifier doesn't match expected value (906)
        at sun.security.krb5.internal.KDCRep.init(KDCRep.java:143)
        at sun.security.krb5.internal.TGSRep.init(TGSRep.java:66)
        at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:61)
        at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
        ... 25 more
----

or

----
2015-01-12 14:47:29,440 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
+java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed + at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) + at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51) + at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48) + at java.security.AccessController.doPrivileged(Native Method) + at javax.security.auth.Subject.doAs(Subject.java:356) + at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608) + at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48) + at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208) + at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) + at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) + at java.lang.Thread.run(Thread.java:745) +Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed + at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190) + at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) + at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) + at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) + at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) + ... 10 more +---- + +*A*: As previously mentioned, the hostname, and subsequently the address each Accumulo process is bound/listening +on, is extremely important when negotiating an SASL connection. This problem commonly arises when the Accumulo +servers are not configured to listen on the address denoted by their FQDN. 
+ +The values in the Accumulo "hosts" files (in +accumulo/conf+: +masters+, +monitors+, +tservers+, +tracers+, +and +gc+) should match the instance component of the Kerberos server principal (e.g. +host+ in +accumulo/h...@example.com+). + +*Q*: After configuring my system for Kerberos, server processes come up normally and I can interact with the system. However, +when I attempt to use the "Recent Traces" page on the Monitor UI I get a stacktrace similar to: + +---- + java.lang.AssertionError: AuthenticationToken should not be null + at org.apache.accumulo.monitor.servlets.trace.Basic.getScanner(Basic.java:139) + at org.apache.accumulo.monitor.servlets.trace.Summary.pageBody(Summary.java:164) + at org.apache.accumulo.monitor.servlets.BasicServlet.doGet(BasicServlet.java:63) + at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) + at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) + at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:738) + at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:551) + at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) + at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:568) + at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) + at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111) + at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478) + at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183) + at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045) + at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) + at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) + at org.eclipse.jetty.server.Server.handle(Server.java:462) + at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279) + at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232) + at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534) + at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607) + at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536) + at java.lang.Thread.run(Thread.java:745) + +---- + +*A*: This indicates that the Monitor has not been able to successfully log in a client-side user to read from the +trace+ table. Accumulo allows the TraceServer to rely on the property +general.kerberos.keytab+ as a fallback when logging in the trace user if the +trace.token.property.keytab+ property isn't defined. Some earlier versions of Accumulo did not do this same fallback for the Monitor's use of the trace user. The end result is that if you configure +general.kerberos.keytab+ and not +trace.token.property.keytab+ you will end up with a system that properly logs trace information but can't view it. + +Ensure you have set +trace.token.property.keytab+ to point to a keytab for the principal defined in +trace.user+ in the +accumulo-site.xml+ file for the Monitor, since that should work in all versions of Accumulo.
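As a sketch, the relevant Monitor-side properties in +accumulo-site.xml+ might look like the following (+trace.user+ and +trace.token.property.keytab+ are the real property names from the text above; the principal name and keytab path are hypothetical placeholders):

----
<property>
  <name>trace.user</name>
  <value>tracer@EXAMPLE.COM</value>
</property>
<property>
  <name>trace.token.property.keytab</name>
  <value>/etc/security/keytabs/tracer.keytab</value>
</property>
----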
http://git-wip-us.apache.org/repos/asf/accumulo-website/blob/32ac61d3/docs/master/multivolume.md ---------------------------------------------------------------------- diff --git a/docs/master/multivolume.md b/docs/master/multivolume.md new file mode 100644 index 0000000..8921f2d --- /dev/null +++ b/docs/master/multivolume.md @@ -0,0 +1,82 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +== Multi-Volume Installations + +This is an advanced configuration setting for very large clusters +under a lot of write pressure. + +The HDFS NameNode holds all of the metadata about the files in +HDFS. For fast performance, all of this information needs to be stored +in memory. A single NameNode with 64G of memory can store the +metadata for tens of millions of files. However, when scaling beyond a +thousand nodes, an active Accumulo system can generate lots of updates +to the file system, especially when data is being ingested. The large +number of write transactions to the NameNode, and the speed of a +single edit log, can become the limiting factor for large scale +Accumulo installations. 
+ +You can see the effect of slow write transactions when the Accumulo +Garbage Collector takes a long time (more than 5 minutes) to delete +the files Accumulo no longer needs. If your Garbage Collector +routinely runs in less than a minute, the NameNode is performing well. + +However, if you do begin to experience slow-down and poor GC +performance, Accumulo can be configured to use multiple NameNode +servers. The configuration ``instance.volumes'' should be set to a +comma-separated list, using full URI references to different NameNode +servers: + +[source,xml] +<property> + <name>instance.volumes</name> + <value>hdfs://ns1:9001,hdfs://ns2:9001</value> +</property> + +The introduction of multiple volume support in 1.6 changed the way Accumulo +stores pointers to files. It now stores fully qualified URI references to +files. Before 1.6, Accumulo stored paths that were relative to a table +directory. After an upgrade these relative paths will still exist and are +resolved using instance.dfs.dir, instance.dfs.uri, and Hadoop configuration in +the same way they were before 1.6. + +If the URI for a namenode changes (e.g. the namenode was running on host1 and it is +moved to host2), then Accumulo will no longer function. Even if Hadoop and +Accumulo configurations are changed, the fully qualified URIs stored in +Accumulo will still contain the old URI. To handle this, Accumulo has the +following configuration property for replacing URIs stored in its metadata. The +example configuration below will replace ns1 with nsA and ns2 with nsB in +Accumulo metadata. For this property to take effect, Accumulo will need to be +restarted. + +[source,xml] +<property> + <name>instance.volumes.replacements</name> + <value>hdfs://ns1:9001 hdfs://nsA:9001, hdfs://ns2:9001 hdfs://nsB:9001</value> +</property> + +Using viewfs or HA namenode, introduced in Hadoop 2, offers another option for +managing the fully qualified URIs stored in Accumulo. 
Viewfs and HA namenode +both introduce a level of indirection in the Hadoop configuration. For +example, assume viewfs:///nn1 maps to hdfs://nn1 in the Hadoop configuration. +If viewfs://nn1 is used by Accumulo, then it is easy to map viewfs://nn1 to +hdfs://nnA by changing the Hadoop configuration without doing anything to Accumulo. +A production system should probably use an HA namenode. Viewfs may be useful on +a test system with a single non-HA namenode. + +You may also want to configure your cluster to use Federation, +available in Hadoop 2.0, which allows DataNodes to respond to multiple +NameNode servers, so you do not have to partition your DataNodes by +NameNode. http://git-wip-us.apache.org/repos/asf/accumulo-website/blob/32ac61d3/docs/master/replication.md ---------------------------------------------------------------------- diff --git a/docs/master/replication.md b/docs/master/replication.md new file mode 100644 index 0000000..fd8905e --- /dev/null +++ b/docs/master/replication.md @@ -0,0 +1,399 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +== Replication + +=== Overview + +Replication is a feature of Accumulo which provides a mechanism to automatically +copy data to other systems, typically for the purpose of disaster recovery, +high availability, or geographic locality. It is best to consider this feature +as a framework for automatic replication rather than the ability to copy data +from one Accumulo instance to another, as copying to another Accumulo cluster is +only an implementation detail. The local Accumulo cluster is hereafter referred +to as the +primary+ while systems being replicated to are known as ++peers+. + +This replication framework makes two Accumulo instances, where one instance +replicates to another, eventually consistent with one another, as opposed +to the strong consistency that each single Accumulo instance still holds. That +is to say, attempts to read data from a table on a peer which has pending replication +from the primary will not wait for that data to be replicated before running the scan. +This is desirable for a number of reasons, the most important being that the replication +framework is not limited by network outages or offline peers, but only by the HDFS +space available on the primary system. + +A replication configuration can be considered a directed graph which allows cycles. +The systems from which data was replicated are recorded in each Mutation, which +allows each system to determine whether a peer already has the data +the system wants to send. + +Data is replicated by using the Write-Ahead Logs (WALs) that each TabletServer is +already maintaining. TabletServers record, in the +accumulo.metadata+ table, which WALs have data that needs to be +replicated. The Master uses these records, +combined with the local Accumulo table that the WAL was used with, to create records +in the +replication+ table which track which peers the given WAL should be +replicated to. 
The Master later uses these work entries to assign the actual +replication task to a local TabletServer using ZooKeeper. A TabletServer will get +a lock in ZooKeeper for the replication of this file to a peer, and proceed to +replicate to the peer, recording progress in the +replication+ table as +data is successfully replicated on the peer. Later, the Master and Garbage Collector +will remove records from the +accumulo.metadata+ and +replication+ tables +and files from HDFS, respectively, after replication to all peers is complete. + +=== Configuration + +Configuration of Accumulo to replicate data to another system can be categorized +into the following sections. + +==== Site Configuration + +Each system involved in replication (even the primary) needs a name that uniquely +identifies it across all peers in the replication graph. This should be considered +fixed for an instance, and set in +accumulo-site.xml+. + +---- +<property> + <name>replication.name</name> + <value>primary</value> + <description>Unique name for this system used by replication</description> +</property> +---- + +==== Instance Configuration + +For each peer of this system, Accumulo needs to know the name of that peer, +the class used to replicate data to that system and some configuration information +to connect to this remote peer. In the case of Accumulo, this additional data +is the Accumulo instance name and ZooKeeper quorum; however, this varies with the +replication implementation for the peer. + +These can be set in the site configuration to ease deployments; however, as they may +change, it can be useful to set this information using the Accumulo shell. + +To configure a peer with the name +peer1+ which is an Accumulo system with an instance name of +accumulo_peer+ +and a ZooKeeper quorum of +10.0.0.1,10.0.2.1,10.0.3.1+, invoke the following +command in the shell. 
+ +---- +root@accumulo_primary> config -s +replication.peer.peer1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,accumulo_peer,10.0.0.1,10.0.2.1,10.0.3.1 +---- + +Since this is an Accumulo system, we also want to set a username and password +to use when authenticating with this peer. On our peer, we create a special user, "replication" +with a password of "password", which has permission to write to the tables we want to replicate data into. We then need to record this in the primary's configuration. + +---- +root@accumulo_primary> config -s replication.peer.user.peer1=replication +root@accumulo_primary> config -s replication.peer.password.peer1=password +---- + +Alternatively, when configuring replication on Accumulo running Kerberos, a keytab +file per peer can be configured instead of a password. The provided keytabs must be readable +by the unix user running Accumulo. The keytab for a peer can be distinct from the +keytab used by Accumulo or the keytabs for other peers. + +---- +accum...@example.com@accumulo_primary> config -s replication.peer.user.peer1=replicat...@example.com +accum...@example.com@accumulo_primary> config -s replication.peer.keytab.peer1=/path/to/replication.keytab +---- + +==== Table Configuration + +Now that we have a peer defined, we just need to configure which tables will +replicate to that peer. We also need to configure an identifier to determine where +this data will be replicated on the peer. Since we're replicating to another Accumulo +cluster, this is a table ID. In this example, we want to enable replication on ++my_table+ and configure our peer +accumulo_peer+ as a target, sending +the data to the table with an ID of +2+ in +accumulo_peer+. 
+ +---- +root@accumulo_primary> config -t my_table -s table.replication=true +root@accumulo_primary> config -t my_table -s table.replication.target.accumulo_peer=2 +---- + +To replicate a single table on the primary to multiple peers, the second command +in the above shell snippet can be issued for each peer and remote identifier pair. + +=== Monitoring + +Basic information about replication status from a primary can be found on the Accumulo +Monitor server, using the +Replication+ link in the sidebar. + +On this page, information is broken down into the following sections: + +1. Files pending replication by peer and target +2. Files queued for replication, with progress made + +=== Work Assignment + +Depending on the schema of a table, different WorkAssigner implementations can +be configured. The implementation is controlled via the property +replication.work.assigner+ +and the full class name for the implementation. This can be configured via the shell or ++accumulo-site.xml+. + +---- +<property> + <name>replication.work.assigner</name> + <value>org.apache.accumulo.master.replication.SequentialWorkAssigner</value> + <description>Implementation used to assign work for replication</description> +</property> +---- + +---- +root@accumulo_primary> config -t my_table -s replication.work.assigner=org.apache.accumulo.master.replication.SequentialWorkAssigner +---- + +Two implementations are provided. By default, the +SequentialWorkAssigner+ is configured for an +instance. The SequentialWorkAssigner ensures that, per peer and remote identifier, each WAL is +replicated in the order in which it was created. This is sufficient to ensure that updates to a table +will be replayed in the correct order on the peer. This implementation has the downside of only replicating +a single WAL at a time. 
+ +The second implementation, the +UnorderedWorkAssigner+, can be used to overcome the limitation +of only a single WAL being replicated to a target and peer at any time. Depending on the table schema, +it's possible that multiple versions of the same Key with different values are infrequent or nonexistent. +In this case, parallel replication to a peer and target is possible without any downsides. In the case +where this implementation is used and column updates are frequent, it is possible that there will be +an inconsistency between the primary and the peer. + +=== ReplicaSystems + ++ReplicaSystem+ is the interface which allows abstraction of replication of data +to peers of various types. Presently, only an +AccumuloReplicaSystem+ is provided +which will replicate data to another Accumulo instance. A +ReplicaSystem+ implementation +is run inside of the TabletServer process, and can be configured as mentioned in the ++Instance Configuration+ section of this document. Theoretically, an implementation +of this interface could send data to other filesystems, databases, etc. + +==== AccumuloReplicaSystem + +The +AccumuloReplicaSystem+ uses Thrift to communicate with a peer Accumulo instance +and replicate the necessary data. The TabletServer running on the primary will communicate +with the Master on the peer to request the address of a TabletServer on the peer which +this TabletServer will use to replicate the data. + +The TabletServer on the primary will then replicate data in batches of a configurable +size (+replication.max.unit.size+). The TabletServer on the peer will report how many +records were applied back to the primary, which will be used to record how many records +were successfully replicated. The TabletServer on the primary will continue to replicate +data in these batches until no more data can be read from the file. 
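The batch-and-acknowledge protocol described above can be sketched, independently of Accumulo, as a loop that sends chunks of at most +replication.max.unit.size+ bytes and records the peer's acknowledgement before advancing. All names in this sketch are illustrative; none of them are Accumulo APIs.

```python
# Illustrative sketch of batch-and-acknowledge replication; not Accumulo code.

def replicate_file(wal_records, send_batch, max_unit_size=64):
    """Replicate records in batches; advance only on acknowledgement.

    wal_records: records to ship, simplified here to strings whose
                 len() stands in for their serialized size in bytes.
    send_batch:  callable simulating the peer RPC; returns the number
                 of records the peer applied.
    """
    replicated = 0
    i = 0
    while i < len(wal_records):
        # Accumulate records until the batch reaches max_unit_size.
        batch, size = [], 0
        while i < len(wal_records) and size < max_unit_size:
            batch.append(wal_records[i])
            size += len(wal_records[i])
            i += 1
        # The peer reports how many records were applied;
        # the primary records that progress before continuing.
        replicated += send_batch(batch)
    return replicated

records = ["record-%02d" % n for n in range(10)]  # 9 bytes each
applied = replicate_file(records, send_batch=len, max_unit_size=20)
```

Here `send_batch=len` stands in for the peer acknowledging every record in the batch; a real peer could apply fewer, and the primary would resume from its recorded progress.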
+ +=== Other Configuration + +There are a number of configuration values that can be used to control how +the various components operate. + +[width="75%",cols=">,^2,^2"] +[options="header"] +|==== +|Property | Description | Default +|replication.max.work.queue | Maximum number of files queued for replication at one time | 1000 +|replication.work.assignment.sleep | Time between invocations of the WorkAssigner | 30s +|replication.worker.threads | Size of threadpool used to replicate data to peers | 4 +|replication.receipt.service.port | Thrift service port to listen for replication requests, can use '0' for a random port | 10002 +|replication.work.attempts | Number of attempts to replicate to a peer before aborting the attempt | 10 +|replication.receiver.min.threads | Minimum number of idle threads for handling incoming replication | 1 +|replication.receiver.threadcheck.time | Time between attempting adjustments of thread pool for incoming replications | 30s +|replication.max.unit.size | Maximum amount of data to be replicated in one RPC | 64M +|replication.work.assigner | Work Assigner implementation | org.apache.accumulo.master.replication.SequentialWorkAssigner +|tserver.replication.batchwriter.replayer.memory| Size of BatchWriter cache to use in applying replication requests | 50M +|==== + +=== Example Practical Configuration + +A real-life example is now provided to give a concrete application of replication configuration. This +example is a two instance Accumulo system, one primary system and one peer system. They are called +primary and peer, respectively. Each system also has a table of the same name, "my_table". The instance +name for each is also the same (primary and peer), and both have ZooKeeper hosts on a node with a hostname +with that name as well (primary:2181 and peer:2181). + +We want to configure these systems so that "my_table" on "primary" replicates to "my_table" on "peer". 
+ +==== accumulo-site.xml + +We can assign the "unique" name that identifies this Accumulo instance among all others that might participate +in replication together. In this example, we will use the names provided in the description. + +===== Primary + +---- +<property> + <name>replication.name</name> + <value>primary</value> + <description>Defines the unique name</description> +</property> +---- + +===== Peer + +---- +<property> + <name>replication.name</name> + <value>peer</value> +</property> +---- + +==== masters and tservers files + +Be *sure* to use non-local IP addresses. Other nodes need to connect to these processes, and using localhost will likely result in +a node talking to itself rather than the intended remote node. + +==== Start both instances + +The rest of the configuration is dynamic and is best configured on the fly (in ZooKeeper) rather than in accumulo-site.xml. + +==== Peer + +The next series of commands is to be run on the peer system. Create a user account for the primary instance called +"peer". The password for this account will need to be saved in the configuration on the primary. + +---- +root@peer> createtable my_table +root@peer> createuser peer +root@peer> grant -t my_table -u peer Table.WRITE +root@peer> grant -t my_table -u peer Table.READ +root@peer> tables -l +---- + +Remember what the table ID for 'my_table' is. You'll need that to configure the primary instance. + +==== Primary + +Next, configure the primary instance. + +===== Set up the table + +---- +root@primary> createtable my_table +---- + +===== Define the Peer as a replication peer to the Primary + +We're defining the instance with replication.name of 'peer' as a peer. We provide the implementation of ReplicaSystem +that we want to use, and the configuration for the AccumuloReplicaSystem. In this case, the configuration is the Accumulo +Instance name for 'peer' and the ZooKeeper quorum string. The configuration key is of the form +"replication.peer.$peer_name". 
+ +---- +root@primary> config -s replication.peer.peer=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,peer,$peer_zk_quorum +---- + +===== Set the authentication credentials + +We want to use that special username and password that we created on the peer, so we have a means to write data to +the table that we want to replicate to. The configuration key is of the form "replication.peer.user.$peer_name". + +---- +root@primary> config -s replication.peer.user.peer=peer +root@primary> config -s replication.peer.password.peer=peer +---- + +===== Enable replication on the table + +Now that we have defined the peer on the primary and provided the authentication credentials, we need to configure +our table with the implementation of ReplicaSystem we want to use to replicate to the peer. In this case, our peer +is an Accumulo instance, so we want to use the AccumuloReplicaSystem. + +The configuration for the AccumuloReplicaSystem is the table ID for the table on the peer instance that we +want to replicate into. Be sure to use the correct value for $peer_table_id. The configuration key is of +the form "table.replication.target.$peer_name". + +---- +root@primary> config -t my_table -s table.replication.target.peer=$peer_table_id +---- + +Finally, we can enable replication on this table. + +---- +root@primary> config -t my_table -s table.replication=true +---- + +=== Extra considerations for use + +While this feature is intended for general-purpose use, its implementation does carry some baggage. Like any software, +replication is a feature that operates well within some set of use cases but is not meant to support all use cases. +For the benefit of the users, we can enumerate these cases. + +==== Latency + +As previously mentioned, the replication feature uses the Write-Ahead Log files for a number of reasons, one of which +is to prevent the need for data to be written to RFiles before it is available to be replicated. 
+While this can help +reduce the latency for a batch of Mutations that have been written to Accumulo, the latency is at least seconds to tens +of seconds for replication once ingest is active. For a table on which replication has just been enabled, it will likely +take a few minutes before replication begins. + +Once ingest is active and flowing into the system at a regular rate, replication should be occurring at a similar rate, +given sufficient computing resources. Replication attempts to copy data at a rate that is to be considered low latency +but is not a replacement for custom indexing code which can ensure near real-time referential integrity on secondary indexes. + +==== Table-Configured Iterators + +Accumulo Iterators tend to be a heavy hammer which can be used to solve a variety of problems. In general, it is highly +recommended that Iterators which are applied at major compaction time are both idempotent and associative due to the +non-determinism in which some set of files for a Tablet might be compacted. In practice, this translates to common patterns, +such as aggregation, which are implemented in a manner resilient to duplication (such as using a Set instead of a List). + +Due to the asynchronous nature of replication and the expectation that hardware failures and network partitions will exist, +it is generally not recommended to configure replication on a table which has non-idempotent Iterators set. +While the replication implementation can make some simple assertions to try to avoid re-replication of data, it is not +presently guaranteed that all data will only be sent to a peer once. Data will be replicated at least once. Typically, +this is not a problem as the VersioningIterator will automatically deduplicate this over-replication because the duplicates will +have the same timestamp; however, certain Combiners may result in inaccurate aggregations. 
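The hazard can be demonstrated with a small, self-contained simulation (plain Python, no Accumulo involved): an unacknowledged update is re-sent under at-least-once delivery, which double-counts under a summing aggregation but is harmless for an idempotent, set-based aggregation of the kind mentioned above.

```python
# Simulate at-least-once delivery: the update "2" is applied on the peer,
# but its acknowledgement is lost, so the primary re-sends it.
updates = [1, 2, 2, 3]  # the second "2" is the re-send

# Non-idempotent aggregation (summing): the re-send corrupts the result.
summed = sum(updates)     # 8 on the peer, though the primary holds 6

# Idempotent aggregation (a set of distinct versions): the duplicate
# is absorbed, and the peer agrees with the primary.
distinct = set(updates)   # {1, 2, 3}
```

This is the same failure mode the SummingCombiner discussion below walks through in prose.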
+ +As a concrete example, consider a table which has the SummingCombiner configured to sum all values for +multiple versions of the same Key. For some key, consider a set of numeric values that are written to a table on the +primary: [1, 2, 3]. On the primary, all of these are successfully written and thus the current value for the given key +would be 6, (1 + 2 + 3). Consider, however, that each of these updates to the peer was replicated independently (because +other data was also included in the write-ahead log that needed to be replicated). The update with a value of 1 was +successfully replicated, and then we attempted to replicate the update with a value of 2 but the remote server never +responded. The primary does not know whether the update with a value of 2 was actually applied or not, so the +only recourse is to re-send the update. After we receive confirmation that the update with a value of 2 was replicated, +we will then replicate the update with 3. If the peer never applied the first update of '2', the summation is accurate. +If the update was applied but the acknowledgement was lost for some reason (system failure, network partition), the +update will be resent to the peer. Because addition is non-idempotent, we have created an inconsistency between the +primary and peer. As such, the SummingCombiner wouldn't be recommended on a table being replicated. + +While changes could be made to the replication implementation to attempt to mitigate this risk, +presently it is not recommended to configure non-idempotent Iterators or Combiners on replicated tables in cases where +inaccuracy of aggregations is not acceptable. + +==== Duplicate Keys + +In Accumulo, when more than one key exists that is exactly the same, equal down to the timestamp, +the retained value is non-deterministic. Replication introduces another level of non-determinism in this case. 
+For a table that is being replicated and has multiple equal keys with different values inserted into it, the final +value in that table on the primary instance is not guaranteed to be the final value on all replicas. + +For example, say the values that were inserted on the primary instance were +value1+ and +value2+ and the final +value was +value1+; it is not guaranteed that all replicas will have +value1+ like the primary. The final value is +non-deterministic for each instance. + +As is the recommendation without replication enabled, if multiple values for the same key (sans timestamp) are written to +Accumulo, it is strongly recommended that the timestamp properly reflects the version intended by +the client. That is to say, newer values inserted into the table should have larger timestamps. If the time between +writing updates to the same key is significant (on the order of minutes), this concern can likely be ignored. + +==== Bulk Imports + +Currently, files that are bulk imported into a table configured for replication are not replicated. There is no +technical reason why it was not implemented; it was simply omitted from the initial implementation. This is considered a +fair limitation because bulk importing generated files into multiple locations is much simpler than bifurcating "live" ingest +data into two instances. Given some existing bulk import process which creates files and then imports them into an +Accumulo instance, it is trivial to copy those files to a new HDFS instance and import them into another Accumulo +instance using the same process. Hadoop's +distcp+ command provides an easy way to copy large amounts of data to another +HDFS instance which makes the problem of duplicating bulk imports very easy to solve. 
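For example, an existing bulk import pipeline could be extended with a +distcp+ step before running the same import process against the peer (the paths and NameNode addresses below are hypothetical placeholders):

----
hadoop distcp hdfs://primary-nn:9001/bulk/my_table hdfs://peer-nn:9001/bulk/my_table
----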
http://git-wip-us.apache.org/repos/asf/accumulo-website/blob/32ac61d3/docs/master/sampling.md ---------------------------------------------------------------------- diff --git a/docs/master/sampling.md b/docs/master/sampling.md new file mode 100644 index 0000000..237f43c --- /dev/null +++ b/docs/master/sampling.md @@ -0,0 +1,86 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +== Sampling + +=== Overview + +Accumulo has the ability to generate and scan a per-table set of sample data. +This sample data is kept up to date as a table is mutated. What key values are +placed in the sample data is configurable per table. + +This feature can be used for query estimation and optimization. For an example +of estimation, assume an Accumulo table is configured to generate a sample +containing one millionth of a table's data. If a query is executed against the +sample and returns one thousand results, then the same query against all the +data would probably return a billion results. A nice property of having +Accumulo generate the sample is that it is always up to date, so estimates +will be accurate even when querying the most recently written data. 
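The estimation arithmetic from the example above is just a scale-up by the sampling ratio (the numbers are the ones used in the text):

```python
records_per_sample_record = 1_000_000  # sample holds one millionth of the data
sample_hits = 1000                     # results from querying the sample
# Scale the sample count up by the sampling ratio to estimate the
# number of results the same query would return over all the data.
estimated_total = sample_hits * records_per_sample_record  # about a billion
```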
+ +An example of a query optimization is an iterator using sample data to get an estimate, and then making decisions based on the estimate. + +=== Configuring + +In order to use sampling, an Accumulo table must be configured with a class that implements +org.apache.accumulo.core.sample.Sampler+ along with options for that class. For guidance on implementing a Sampler, see that interface's javadoc. Accumulo provides a few implementations out of the box. For information on how to use the samplers that ship with Accumulo, look in the package `org.apache.accumulo.core.sample` and consult the javadoc of the classes there. See the https://github.com/apache/accumulo-examples/blob/master/docs/sample.md[sampling example] for examples of how to configure a Sampler on a table. + +Once a table is configured with a sampler, all writes after that point will generate sample data. Sample data will not be present for data written before sampling was configured. A compaction can be initiated that only compacts the files in the table that do not have sample data. The example readme shows how to do this. + +If the sampling configuration of a table is changed, then Accumulo will start generating new sample data with the new configuration. However, old data will still have sample data generated with the previous configuration. A selective compaction can also be issued in this case to regenerate the sample data. + +=== Scanning sample data + +In order to scan sample data, use the +setSamplerConfiguration(...)+ method on +Scanner+ or +BatchScanner+. Please consult that method's javadoc for more information. + +Sample data can also be scanned from within an Accumulo +SortedKeyValueIterator+. To see how to do this, look at the example iterator referenced in the https://github.com/apache/accumulo-examples/blob/master/docs/sample.md[sampling example]. Also, consult the javadoc on +org.apache.accumulo.core.iterators.IteratorEnvironment.cloneWithSamplingEnabled()+.
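The samplers that ship with Accumulo are hash-based. The core decision they make can be sketched in plain Java; note this is only an illustration of the idea, not the actual +Sampler+ interface, and CRC32 stands in here for whatever hash a real sampler uses:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch of hash-based row sampling: a key is in the sample iff a hash of
// its row falls into one of 'modulus' buckets. Because the decision depends
// only on the row, every mutation to a sampled row lands on the same side,
// keeping the sample consistent as the table is mutated.
// This is NOT Accumulo's Sampler interface, only an illustration.
public class RowHashSampleSketch {
    public static boolean accept(String row, int modulus) {
        CRC32 crc = new CRC32();
        crc.update(row.getBytes(StandardCharsets.UTF_8));
        return crc.getValue() % modulus == 0;
    }
}
```

A larger modulus yields a smaller sample; a modulus of 1,000,000 would approximate the one-millionth sample used in the estimation example above.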
+ +MapReduce jobs using the +AccumuloInputFormat+ can also read sample data. See the javadoc for the +setSamplerConfiguration()+ method on ++AccumuloInputFormat+. + +Scans over sample data will throw a +SampleNotPresentException+ in the following cases: + +. sample data is not present, +. sample data is present but was generated with multiple configurations, +. sample data is partially present. + +So a scan over sample data can only succeed if all data written has sample data generated with the same configuration. + +=== Bulk import + +When generating rfiles to bulk import into Accumulo, those rfiles can contain sample data. To use this feature, look at the javadoc on the ++AccumuloFileOutputFormat.setSampler(...)+ method. + http://git-wip-us.apache.org/repos/asf/accumulo-website/blob/32ac61d3/docs/master/security.md ---------------------------------------------------------------------- diff --git a/docs/master/security.md b/docs/master/security.md new file mode 100644 index 0000000..4a57f8a --- /dev/null +++ b/docs/master/security.md @@ -0,0 +1,182 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +== Security + +Accumulo extends the BigTable data model to implement a security mechanism known as cell-level security.
Every key-value pair has its own security label, stored under the column visibility element of the key, which is used to determine whether a given user meets the security requirements to read the value. This enables data of various security levels to be stored within the same row, and users of varying degrees of access to query the same table, while preserving data confidentiality. + +=== Security Label Expressions + +When mutations are applied, users can specify a security label for each value. This is done as the Mutation is created by passing a ColumnVisibility object to the put() method: + +[source,java] +---- +Text rowID = new Text("row1"); +Text colFam = new Text("myColFam"); +Text colQual = new Text("myColQual"); +ColumnVisibility colVis = new ColumnVisibility("public"); +long timestamp = System.currentTimeMillis(); + +Value value = new Value("myValue"); + +Mutation mutation = new Mutation(rowID); +mutation.put(colFam, colQual, colVis, timestamp, value); +---- + +=== Security Label Expression Syntax + +Security labels consist of a set of user-defined tokens that are required to read the value the label is associated with. The set of tokens required can be specified using syntax that supports logical AND +&+ and OR +|+ combinations of terms, as well as nesting groups +()+ of terms together. + +Each term consists of one or more alphanumeric characters, hyphens, underscores, or periods. Optionally, each term may be wrapped in quotation marks, which removes the restriction on valid characters. In quoted terms, quotation marks and backslash characters can be used as characters in the term by escaping them with a backslash. + +For example, suppose within our organization we want to label our data values with security labels defined in terms of user roles.
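As a sketch of these semantics, the following is a minimal evaluator for unquoted terms combined with +&+, +|+ and parentheses. This is an illustration only, not Accumulo's own ColumnVisibility parsing; quoted terms are omitted, and, like Accumulo, it rejects expressions that mix operators without parentheses:

```java
import java.util.Set;

// Minimal sketch of evaluating a visibility expression against a set of
// authorizations. NOT Accumulo's VisibilityEvaluator; quoted terms omitted.
public class VisEval {
    private final String expr;
    private int pos;

    private VisEval(String expr) { this.expr = expr; }

    public static boolean evaluate(String expr, Set<String> auths) {
        if (expr.isEmpty()) {
            return true; // an empty label is visible to everyone
        }
        VisEval e = new VisEval(expr);
        boolean result = e.parseExpr(auths);
        if (e.pos != expr.length()) {
            // e.g. "a&b|c": mixing & and | requires parentheses
            throw new IllegalArgumentException("trailing input in expression");
        }
        return result;
    }

    // expr := factor ('&' factor)* | factor ('|' factor)*
    private boolean parseExpr(Set<String> auths) {
        boolean result = parseFactor(auths);
        if (pos < expr.length() && (expr.charAt(pos) == '&' || expr.charAt(pos) == '|')) {
            char op = expr.charAt(pos); // only one operator kind per level
            while (pos < expr.length() && expr.charAt(pos) == op) {
                pos++;
                boolean rhs = parseFactor(auths);
                result = (op == '&') ? (result && rhs) : (result || rhs);
            }
        }
        return result;
    }

    // factor := term | '(' expr ')'
    private boolean parseFactor(Set<String> auths) {
        if (pos < expr.length() && expr.charAt(pos) == '(') {
            pos++;
            boolean r = parseExpr(auths);
            if (pos >= expr.length() || expr.charAt(pos) != ')') {
                throw new IllegalArgumentException("unbalanced parentheses");
            }
            pos++;
            return r;
        }
        // a term: alphanumerics, hyphens, underscores, periods
        int start = pos;
        while (pos < expr.length() && (Character.isLetterOrDigit(expr.charAt(pos))
                || "-_.".indexOf(expr.charAt(pos)) >= 0)) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected a term at position " + pos);
        }
        return auths.contains(expr.substring(start, pos));
    }
}
```

A user holding the authorizations admin and audit would, under these semantics, pass the labels admin, admin&audit and (admin|system)&audit, but not system.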
We might have tokens such as: + + admin + audit + system + +These can be specified alone or combined using logical operators: + +---- +// Users must have admin privileges +admin + +// Users must have admin and audit privileges +admin&audit + +// Users with either admin or audit privileges +admin|audit + +// Users must have audit and one or both of admin or system +(admin|system)&audit +---- + +When both +|+ and +&+ operators are used, parentheses must be used to specify precedence of the operators. + +=== Authorization + +When clients attempt to read data from Accumulo, any security labels present are examined against the set of authorizations passed by the client code when the Scanner or BatchScanner is created. If the authorizations are determined to be insufficient to satisfy the security label, the value is suppressed from the set of results sent back to the client. + +Authorizations are specified as a comma-separated list of tokens the user possesses: + +[source,java] +---- +// user possesses both admin and system level access +Authorizations auths = new Authorizations("admin","system"); + +Scanner s = connector.createScanner("table", auths); +---- + +=== User Authorizations + +Each Accumulo user has a set of associated security labels. To manipulate these in the shell while using the default authorizor, use the setauths and getauths commands. These may also be modified for the default authorizor using the Java security operations API. + +When a user creates a scanner, a set of Authorizations is passed. If the authorizations passed to the scanner are not a subset of the user's authorizations, then an exception will be thrown. + +To prevent users from writing data they cannot read, add the visibility constraint to a table. Use the -evc option in the createtable shell command to enable this constraint. For existing tables, use the following shell command to enable the visibility constraint.
Ensure the constraint number does not conflict with any existing constraints. + + config -t table -s table.constraint.1=org.apache.accumulo.core.security.VisibilityConstraint + +Any user with the alter table permission can add or remove this constraint. This constraint is not applied to bulk imported data; if this is a concern, then disable the bulk import permission. + +=== Pluggable Security + +New in Accumulo 1.5 is a pluggable security mechanism. It can be broken into three actions -- authentication, authorization, and permission handling. By default all of these are handled in ZooKeeper, which is how things were handled in Accumulo 1.4 and before. It is worth noting that this is a new feature in 1.5 and may be adjusted in future releases without the standard deprecation cycle. + +Authentication handles the ability for a user to verify their identity. A combination of principal and authentication token is used to verify a user is who they say they are. An authentication token can be constructed directly through its constructor, but it is advised to use the +init(Property)+ method to populate it. It is expected that a user knows the appropriate token to use for their system. The default token is ++PasswordToken+. + +Once a user is authenticated by the Authenticator, the user has access to the other actions within Accumulo. All actions in Accumulo are ACLed, and this ACL check is handled by the Permission Handler. This is what manages all of the permissions, which are divided into system and per-table levels. From there, if a user is performing an action which requires authorizations, the Authorizor is queried to determine what authorizations the user has. + +This setup allows a variety of different mechanisms to be used for handling different aspects of Accumulo's security.
A system like Kerberos can be used for authentication, then a system like LDAP could be used to determine if a user has a specific permission, and then it may default back to the default ZookeeperAuthorizor to determine what Authorizations a user is ultimately allowed to use. This is a pluggable system, so custom components can be created depending on your need. + +=== Secure Authorizations Handling + +For applications serving many users, it is not expected that an Accumulo user will be created for each application user. In this case, an Accumulo user with all authorizations needed by any of the application's users must be created. To service queries, the application should create a scanner with the application user's authorizations. These authorizations could be obtained from a trusted third party. + +Often production systems will integrate with Public-Key Infrastructure (PKI) and designate client code within the query layer to negotiate with PKI servers in order to authenticate users and retrieve their authorization tokens (credentials). This requires users to specify only the information necessary to authenticate themselves to the system. Once user identity is established, their credentials can be accessed by the client code and passed to Accumulo outside of the reach of the user. + +=== Query Services Layer + +Since the primary method of interaction with Accumulo is through the Java API, production environments often call for the implementation of a Query Services Layer. This can be done using web services in containers such as Apache Tomcat, but is not a requirement. The Query Services Layer provides a platform on which user-facing applications can be built. This allows the application designers to isolate potentially complex query logic, and enables a convenient point at which to perform essential security functions.
+ +Several production environments choose to implement authentication at this layer, where users' identifiers are used to retrieve their access credentials, which are then cached within the query layer and presented to Accumulo through the Authorizations mechanism. + +Typically, the query services layer sits between Accumulo and user workstations. http://git-wip-us.apache.org/repos/asf/accumulo-website/blob/32ac61d3/docs/master/shell.md ---------------------------------------------------------------------- diff --git a/docs/master/shell.md b/docs/master/shell.md new file mode 100644 index 0000000..4d53e3d --- /dev/null +++ b/docs/master/shell.md @@ -0,0 +1,159 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +== Accumulo Shell +Accumulo provides a simple shell that can be used to examine the contents and configuration settings of tables, insert/update/delete values, and change configuration settings.
+ +The shell can be started with the following command: + + accumulo shell -u [username] + +The shell will prompt for the password corresponding to the specified username and then display the following prompt: + + Shell - Apache Accumulo Interactive Shell + - + - version: 2.x.x + - instance name: myinstance + - instance id: 00000000-0000-0000-0000-000000000000 + - + - type 'help' for a list of available commands + - + root@myinstance> + +=== Basic Administration + +The Accumulo shell can be used to create and delete tables, as well as to configure table and instance specific options. + +---- +root@myinstance> tables +accumulo.metadata +accumulo.root + +root@myinstance> createtable mytable + +root@myinstance mytable> + +root@myinstance mytable> tables +accumulo.metadata +accumulo.root +mytable + +root@myinstance mytable> createtable testtable + +root@myinstance testtable> + +root@myinstance testtable> deletetable testtable +deletetable { testtable } (yes|no)? yes +Table: [testtable] has been deleted. + +root@myinstance> +---- + +The Shell can also be used to insert updates and scan tables. This is useful for inspecting tables. + +---- +root@myinstance mytable> scan + +root@myinstance mytable> insert row1 colf colq value1 +insert successful + +root@myinstance mytable> scan +row1 colf:colq [] value1 +---- + +The value in brackets ``[]'' is the visibility label. Since none was used, it is empty for this row. You can use the +-st+ option to scan to see the timestamp for the cell, too. + +=== Table Maintenance + +The *compact* command instructs Accumulo to schedule a compaction of the table, during which files are consolidated and deleted entries are removed. + + root@myinstance mytable> compact -t mytable + 07 16:13:53,201 [shell.Shell] INFO : Compaction of table mytable started for given range + +The *flush* command instructs Accumulo to write all entries currently in memory for a given table to disk.
+ + root@myinstance mytable> flush -t mytable + 07 16:14:19,351 [shell.Shell] INFO : Flush of table mytable + initiated... + +=== User Administration + +The Shell can be used to add, remove, and grant privileges to users. + +---- +root@myinstance mytable> createuser bob +Enter new password for 'bob': ********* +Please confirm new password for 'bob': ********* + +root@myinstance mytable> authenticate bob +Enter current password for 'bob': ********* +Valid + +root@myinstance mytable> grant System.CREATE_TABLE -s -u bob + +root@myinstance mytable> user bob +Enter current password for 'bob': ********* + +bob@myinstance mytable> userpermissions +System permissions: System.CREATE_TABLE +Table permissions (accumulo.metadata): Table.READ +Table permissions (mytable): NONE + +bob@myinstance mytable> createtable bobstable + +bob@myinstance bobstable> + +bob@myinstance bobstable> user root +Enter current password for 'root': ********* + +root@myinstance bobstable> revoke System.CREATE_TABLE -s -u bob +---- + +=== JSR-223 Support in the Shell + +The script command can be used to invoke programs written in languages supported by installed JSR-223 +engines. You can get a list of installed engines with the -l argument. Below is an example of the output +of the command when running the Shell with Java 7. + +---- +root@fake> script -l + Engine Alias: ECMAScript + Engine Alias: JavaScript + Engine Alias: ecmascript + Engine Alias: javascript + Engine Alias: js + Engine Alias: rhino + Language: ECMAScript (1.8) + Script Engine: Mozilla Rhino (1.7 release 3 PRERELEASE) +ScriptEngineFactory Info +---- + + A list of compatible languages can be found at https://en.wikipedia.org/wiki/List_of_JVM_languages. The +rhino javascript engine is provided with the JVM. Typically putting a jar on the classpath is all that is +needed to install a new engine. + + When writing scripts to run in the shell, you will have a variable called connection already available +to you. 
This variable is a reference to an Accumulo Connector object, the same connection that the Shell +is using to communicate with the Accumulo servers. At this point you can use any of the public API methods +within your script. Reference the script command help to see all of the execution options. Script and script +invocation examples can be found in ACCUMULO-1399.