Instead of specifying the host name in the server principal, have you tried using hdfs/_HOST@TNBSOUND.COM?
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html#Kerberos_principals_for_Hadoop_Daemons

  <name>dfs.journalnode.kerberos.principal</name>
  <value>hdfs/_HOST@TNBSOUND.COM</value>

Wei-Chiu Chuang
A very happy Clouderan

> On Oct 20, 2016, at 10:19 AM, Mark Selby <[email protected]> wrote:
>
> We have an existing CDH 5.5.1 cluster with simple authentication and no
> authorization. We are building out a new cluster and plan to move to CDH
> 5.8.2 with Kerberos-based authentication. We have an existing MIT Kerberos
> infrastructure which we successfully use for a variety of services
> (ssh, apache, postfix).
>
> I am very confident that our /etc/krb5.conf and name resolution are working. I
> have even used HadoopDNSVerifier-1.0.jar to verify that Java sees the same
> name canonicalization that we see.
>
> I have built a test cluster and closely followed the instructions in the
> secure Hadoop install doc from the Cloudera site, making sure that all the conf
> files are properly edited and all the Kerberos keytabs contain the correct
> principals and have the correct permissions.
>
> We are using HA namenodes with quorum-based journal managers.
>
> I am running into a persistent problem with many Hadoop components when they
> need to talk securely to remote servers. The two examples that I post here are
> the namenode needing to talk to remote journalnodes and the command-line hdfs
> client needing to speak to a remote namenode.
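Both of those failure modes share one mechanism: the client reads a single principal from configuration and demands it from every peer it dials, so a value pinned to one concrete host can only ever match that one daemon. A rough sketch of the comparison (illustrative names; roughly where Hadoop's SASL RPC client performs this check):

```python
def expected_principal(configured: str, canonical_dest: str) -> str:
    """The principal the RPC client will demand from the server it dials:
    the configured value, with any _HOST placeholder replaced by the
    canonicalized destination host."""
    return configured.replace("_HOST", canonical_dest.lower())

pinned = "hdfs/aw1hdnn001.tnbsound.com@TNBSOUND.COM"
wildcard = "hdfs/_HOST@TNBSOUND.COM"

for dest in ("aw1hdnn001.tnbsound.com", "aw1hdnn002.tnbsound.com"):
    # Each server actually authenticates as hdfs/<its own host>@REALM.
    actual = f"hdfs/{dest}@TNBSOUND.COM"
    print(dest,
          expected_principal(pinned, dest) == actual,    # literal host
          expected_principal(wildcard, dest) == actual)  # _HOST
# -> aw1hdnn001.tnbsound.com True True
# -> aw1hdnn002.tnbsound.com False True
```

This is also why the symptom only appears against remote daemons: the pinned value happens to equal the local daemon's principal, so local connections succeed.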
> Both give the same error:
>
> Server has invalid Kerberos principal:
> hdfs/[email protected]; Host Details : local host is:
> "aw1hdnn001.tnbsound.com/10.132.8.19"; destination host is:
> "aw1hdnn002.tnbsound.com":8020;
>
> There is not much on the inter-webs about this, and the error that is showing
> up is leading me to believe that the issue is around the Kerberos realm being
> used in one place and not the other.
>
> I just can not seem to figure out what is going on here, as I know these are
> valid principals. I have added a snippet at the end where I have enabled
> Kerberos debugging to see if that helps at all.
>
> The weird part is that this error applies only to remote daemons. The local
> namenode and journalnode do not have the issue. We can "speak" locally but
> not remotely.
>
> Any and all help is greatly appreciated.
>
> #
> # This is me with hdfs Kerberos credentials trying to run hdfs dfsadmin -refreshServiceAcl
> #
>
> hdfs@aw1hdnn001 /var/lib/hadoop-hdfs 53$ klist
> Ticket cache: FILE:/tmp/krb5cc_115
> Default principal: hdfs/[email protected]
>
> Valid starting       Expires              Service principal
> 10/20/2016 15:34:49  10/21/2016 15:34:49  krbtgt/[email protected]
>         renew until 10/27/2016 15:34:49
>
> hdfs@aw1hdnn001 /var/lib/hadoop-hdfs 54$ hdfs dfsadmin -refreshServiceAcl
> Refresh service acl successful for aw1hdnn001.tnbsound.com/10.132.8.19:8020
> refreshServiceAcl: Failed on local exception: java.io.IOException:
> java.lang.IllegalArgumentException: Server has invalid Kerberos principal:
> hdfs/[email protected]; Host Details : local host is:
> "aw1hdnn001.tnbsound.com/10.132.8.19"; destination host is:
> "aw1hdnn002.tnbsound.com":8020;
>
> #
> # This is the namenode trying to start up and contact an off-server journalnode
> #
> 2016-10-20 16:51:40,703 WARN org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:hdfs/[email protected]
> (auth:KERBEROS) cause:java.io.IOException:
> java.lang.IllegalArgumentException: Server has invalid Kerberos principal:
> hdfs/[email protected]
> 10.132.8.21:8485: Failed on local exception: java.io.IOException:
> java.lang.IllegalArgumentException: Server has invalid Kerberos principal:
> hdfs/[email protected]; Host Details : local host is:
> "aw1hdnn001.tnbsound.com/10.132.8.19"; destination host is:
> "aw1hdrm001.tnbsound.com":8485;
>
> #
> # This is me with hdfs Kerberos credentials trying to run hdfs dfsadmin -refreshServiceAcl with debug info
> #
> hdfs@aw1hdnn001 /var/lib/hadoop-hdfs 46$ HADOOP_OPTS="-Dsun.security.krb5.debug=true" hdfs dfsadmin -refreshServiceAcl
> Java config name: null
> Native config name: /etc/krb5.conf
> Loaded from native config
> >>>KinitOptions cache name is /tmp/krb5cc_115
> >>>DEBUG <CCacheInputStream> client principal is hdfs/[email protected]
> >>>DEBUG <CCacheInputStream> server principal is krbtgt/[email protected]
> >>>DEBUG <CCacheInputStream> key type: 18
> >>>DEBUG <CCacheInputStream> auth time: Thu Oct 20 16:55:42 UTC 2016
> >>>DEBUG <CCacheInputStream> start time: Thu Oct 20 16:55:42 UTC 2016
> >>>DEBUG <CCacheInputStream> end time: Fri Oct 21 16:55:42 UTC 2016
> >>>DEBUG <CCacheInputStream> renew_till time: Thu Oct 27 16:55:42 UTC 2016
> >>> CCacheInputStream: readFlags() FORWARDABLE; PROXIABLE; RENEWABLE; INITIAL; PRE_AUTH;
> >>>DEBUG <CCacheInputStream> client principal is hdfs/[email protected]
> >>>DEBUG <CCacheInputStream> server principal is X-CACHECONF:/krb5_ccache_conf_data/fast_avail/krbtgt/[email protected]
> >>>DEBUG <CCacheInputStream> key type: 0
> >>>DEBUG <CCacheInputStream> auth time: Thu Jan 01 00:00:00 UTC 1970
> >>>DEBUG <CCacheInputStream> start time: null
> >>>DEBUG <CCacheInputStream> end time: Thu Jan 01 00:00:00 UTC 1970
> >>>DEBUG <CCacheInputStream> renew_till time: null
> >>> CCacheInputStream: readFlags()
> >>>DEBUG <CCacheInputStream> client principal is hdfs/[email protected]
> >>>DEBUG <CCacheInputStream> server principal is X-CACHECONF:/krb5_ccache_conf_data/pa_type/krbtgt/[email protected]
> >>>DEBUG <CCacheInputStream> key type: 0
> >>>DEBUG <CCacheInputStream> auth time: Thu Jan 01 00:00:00 UTC 1970
> >>>DEBUG <CCacheInputStream> start time: null
> >>>DEBUG <CCacheInputStream> end time: Thu Jan 01 00:00:00 UTC 1970
> >>>DEBUG <CCacheInputStream> renew_till time: null
> >>> CCacheInputStream: readFlags()
> Found ticket for hdfs/[email protected] to go to krbtgt/[email protected] expiring on Fri Oct 21 16:55:42 UTC 2016
> Entered Krb5Context.initSecContext with state=STATE_NEW
> Found ticket for hdfs/[email protected] to go to krbtgt/[email protected] expiring on Fri Oct 21 16:55:42 UTC 2016
> Service ticket not found in the subject
> >>> Credentials acquireServiceCreds: same realm
> Using builtin default etypes for default_tgs_enctypes
> default etypes for default_tgs_enctypes: 18 17 16 23 1 3.
> >>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
> >>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
> >>> KdcAccessibility: reset
> >>> KrbKdcReq send: kdc=dc1util003.tnbsound.com UDP:88, timeout=30000, number of retries =3, #bytes=734
> >>> KDCCommunication: kdc=dc1util003.tnbsound.com UDP:88, timeout=30000,Attempt =1, #bytes=734
> >>> KrbKdcReq send: #bytes read=721
> >>> KdcAccessibility: remove dc1util003.tnbsound.com
> >>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
> >>> KrbApReq: APOptions are 00100000 00000000 00000000 00000000
> >>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
> Krb5Context setting mySeqNumber to: 561537595
> Created InitSecContextToken:
> 0000: 01 00 6E 82 02 7F 30 82 02 7B A0 03 02 01 05 A1  ..n...0.........
> 0010: 03 02 01 0E A2 07 03 05 00 20 00 00 00 A3 82 01  ......... ......
> 0020: 7A 61 82 01 76 30 82 01 72 A0 03 02 01 05 A1 0E  za..v0..r.......
> 0030: 1B 0C 54 4E 42 53 4F 55 4E 44 2E 43 4F 4D A2 2A  ..TNBSOUND.COM.*
> 0040: 30 28 A0 03 02 01 00 A1 21 30 1F 1B 04 68 64 66  0(......!0...hdf
> 0050: 73 1B 17 61 77 31 68 64 6E 6E 30 30 31 2E 74 6E  s..aw1hdnn001.tn
> 0060: 62 73 6F 75 6E 64 2E 63 6F 6D A3 82 01 2D 30 82  bsound.com...-0.
> 0070: 01 29 A0 03 02 01 12 A1 03 02 01 01 A2 82 01 1B  .)..............
> 0080: 04 82 01 17 04 6E 26 46 08 EA 9C 61 08 80 B8 4B  .....n&F...a...K
> 0090: AF 7C D2 CD 5E 47 19 3D A1 FB CD 8D 41 F4 C9 49  ....^G.=....A..I
> 00A0: 09 95 1C C7 9A D8 1B 92 0F 3C E0 5F 41 BF 99 96  .........<._A...
> 00B0: 42 A9 2D 17 D6 F0 AB 41 72 3E 7E F7 13 33 E2 0A  B.-....Ar>...3..
> 00C0: 2D F5 71 AD 97 9A 9D 7F E0 EA 1A 29 7C D4 47 AB  -.q........)..G.
> 00D0: B4 7E C1 A1 C5 28 DD 46 F1 C4 17 0B FC DB C9 D3  .....(.F........
> 00E0: F4 4D C2 1F 6C 59 A6 C4 9E 9D FD 56 E3 B0 31 E6  .M..lY.....V..1.
> 00F0: C6 6E 50 44 2C 07 44 91 40 F7 C8 6E AD 1E FB 26  .nPD,.D.@..n...&
> 0100: EC 6D E4 ED BC F8 15 17 0B 31 B6 4B 68 64 03 E4  .m.......1.Khd..
> 0110: 28 9B A5 9D AE 2A DF 1B BD 0F B2 AE B3 BB E0 4D  (....*.........M
> 0120: 14 D1 9C E0 AC 99 59 1B B6 28 22 E2 B5 55 52 58  ......Y..("..URX
> 0130: D2 61 39 DE 8F C8 3F E6 6F EB 41 5D E1 F2 43 40  .a9...?.o.A]..C@
> 0140: 8F AC 78 C8 09 35 7B BA 39 6B CD C6 01 7B 90 0B  ..x..5..9k......
> 0150: 20 0C 49 0D 8B E5 2B F1 E6 6F 38 4E EA DF 5C A9   .I...+..o8N..\.
> 0160: 40 AE 11 75 AE B2 E2 35 13 A8 CE CF E7 F5 92 CB  @..u...5........
> 0170: A5 66 53 47 92 5A EF 31 CD 60 CD 67 46 D0 B7 0D  .fSG.Z.1.`.gF...
> 0180: B6 76 FE 09 B1 03 16 FE B8 57 6E 08 9A E6 DD F8  .v.......Wn.....
> 0190: D3 AA 00 54 6C D4 70 61 95 08 CF A4 81 E7 30 81  ...Tl.pa......0.
> 01A0: E4 A0 03 02 01 12 A2 81 DC 04 81 D9 4E 48 9E 35  ............NH.5
> 01B0: 57 7C 7C 54 1C 9F 41 FE F3 C0 94 07 E2 D8 EE 38  W..T..A........8
> 01C0: BA 4A DA 97 43 04 B5 96 F6 A9 34 FD 54 FF 7B 96  .J..C.....4.T...
> 01D0: DA DD A9 6F C4 7B A5 E4 50 9F 9E 1A 62 D3 F3 3C  ...o....P...b..<
> 01E0: 50 50 E9 02 05 F2 37 52 4D BC 86 D8 2B A4 9F FE  PP....7RM...+...
> 01F0: 97 4C 01 7F E6 B4 8B 66 1F 6E 63 FD 3F EF 57 E9  .L.....f.nc.?.W.
> 0200: 04 E9 BE 28 4C 03 BC 26 EB EF EC DC 8C 48 C0 51  ...(L..&.....H.Q
> 0210: 7B 2B 5B 0F 16 7C 83 D0 73 F9 2A 94 CF 67 F2 F8  .+[.....s.*..g..
> 0220: 11 CC 2B E9 0D FE 95 F5 7E 2B C4 40 19 FE FE 6F  ..+.....~+.@...o
> 0230: B7 C4 B8 7E 87 D1 0A 98 8A F2 B0 1A DF FA 27 24  ..............'$
> 0240: C2 EE 06 FE 3F 36 57 3D 6C B9 F3 18 98 19 D6 A1  ....?6W=l.......
> 0250: F4 49 57 5D 58 6E 88 C9 2E 1F FA 7D 53 24 B9 67  .IW]Xn......S$.g
> 0260: 02 85 C2 2C 01 25 18 BA BF 0E 64 A2 C3 06 7D AC  ...,.%....d.....
> 0270: D6 11 A6 F4 ED 47 71 22 CC D4 E8 54 08 17 51 E6  .....Gq"...T..Q.
> 0280: EE 6F FE 31 37  .o.17
> Entered Krb5Context.initSecContext with state=STATE_IN_PROCESS
> >>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
> Krb5Context setting peerSeqNumber to: 374605590
> Krb5Context.unwrap: token=[05 04 01 ff 00 0c 00 00 00 00 00 00 16 54 07 16 01 01 00 00 c5 67 32 c5 74 d0 68 ef 82 46 a8 85 ]
> Krb5Context.unwrap: data=[01 01 00 00 ]
> Krb5Context.wrap: data=[01 01 00 00 ]
> Krb5Context.wrap: token=[05 04 00 ff 00 0c 00 00 00 00 00 00 21 78 62 3b 01 01 00 00 a1 51 c9 92 95 bd cd 88 66 59 b7 49 ]
> Refresh service acl successful for aw1hdnn001.tnbsound.com/10.132.8.19:8020
> refreshServiceAcl: Failed on local exception: java.io.IOException:
> java.lang.IllegalArgumentException: Server has invalid Kerberos principal:
> hdfs/[email protected]; Host Details : local host is:
> "aw1hdnn001.tnbsound.com/10.132.8.19"; destination host is:
> "aw1hdnn002.tnbsound.com":8020;
>
> #
> # hdfs-site.xml
> #
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
>
> <!--               -->
> <!-- HDFS security -->
> <!--               -->
>
> <property>
>   <name>dfs.block.access.token.enable</name>
>   <value>true</value>
> </property>
>
> <!--              -->
> <!-- HA namespace -->
> <!--              -->
>
> <property>
>   <name>dfs.nameservices</name>
>   <value>nbs-aw1-test</value>
> </property>
>
> <!--              -->
> <!-- HA namenodes -->
> <!--              -->
>
> <property>
>   <name>dfs.ha.namenodes.nbs-aw1-test</name>
>   <value>nn1,nn2</value>
> </property>
>
> <property>
>   <name>dfs.namenode.rpc-address.nbs-aw1-test.nn1</name>
>   <value>aw1hdnn001.tnbsound.com:8020</value>
> </property>
>
> <property>
>   <name>dfs.namenode.http-address.nbs-aw1-test.nn1</name>
>   <value>aw1hdnn001.tnbsound.com:50070</value>
> </property>
>
> <property>
>   <name>dfs.namenode.rpc-address.nbs-aw1-test.nn2</name>
>   <value>aw1hdnn002.tnbsound.com:8020</value>
> </property>
>
> <property>
>   <name>dfs.namenode.http-address.nbs-aw1-test.nn2</name>
>   <value>aw1hdnn002.tnbsound.com:50070</value>
> </property>
>
> <!--              -->
> <!-- FS image dir -->
> <!--              -->
>
> <property>
>   <name>dfs.namenode.name.dir</name>
>   <value>/var/lib/hadoop-hdfs/dfs/name</value>
> </property>
>
> <!--            -->
> <!-- QJM config -->
> <!--            -->
>
> <property>
>   <name>dfs.namenode.shared.edits.dir</name>
>   <value>qjournal://aw1hdnn001.tnbsound.com:8485;aw1hdnn002.tnbsound.com:8485;aw1hdrm001.tnbsound.com:8485/nbs-aw1-test</value>
> </property>
>
> <property>
>   <name>dfs.journalnode.edits.dir</name>
>   <value>/var/lib/hadoop-hdfs/dfs/journal</value>
> </property>
>
> <!--                      -->
> <!-- JournalNode security -->
> <!--                      -->
>
> <property>
>   <name>dfs.journalnode.keytab.file</name>
>   <value>/etc/krb5/hdfs.keytab</value>
> </property>
>
> <property>
>   <name>dfs.journalnode.kerberos.principal</name>
>   <value>hdfs/[email protected]</value>
> </property>
>
> <property>
>   <name>dfs.journalnode.kerberos.internal.spnego.principal</name>
>   <value>HTTP/[email protected]</value>
> </property>
>
> <!--                   -->
> <!-- Namenode failover -->
> <!--                   -->
>
> <property>
>   <name>dfs.client.failover.proxy.provider.nbs-aw1-test</name>
>   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> </property>
>
> <property>
>   <name>dfs.ha.automatic-failover.enabled</name>
>   <value>true</value>
> </property>
>
> <property>
>   <name>dfs.ha.fencing.methods</name>
>   <value>sshfence
>          shell(/bin/true)</value>
> </property>
>
> <property>
>   <name>dfs.ha.fencing.ssh.private-key-files</name>
>   <value>/var/lib/hadoop-hdfs/.ssh/id_rsa</value>
> </property>
>
> <property>
>   <name>dfs.ha.fencing.ssh.connect-timeout</name>
>   <value>3000</value>
> </property>
>
> <property>
>   <name>ha.zookeeper.quorum</name>
>   <value>aw1zook001.tnbsound.com:2181,aw1zook002.tnbsound.com:2181,aw1zook003.tnbsound.com:2181</value>
> </property>
>
> <!--                   -->
> <!-- NameNode security -->
> <!--                   -->
>
> <property>
>   <name>dfs.namenode.keytab.file</name>
>   <value>/etc/krb5/hdfs.keytab</value>
> </property>
>
> <property>
>   <name>dfs.namenode.kerberos.principal</name>
>   <value>hdfs/[email protected]</value>
> </property>
>
> <property>
>   <name>dfs.namenode.kerberos.internal.spnego.principal</name>
>   <value>HTTP/[email protected]</value>
> </property>
>
> <!--          -->
> <!-- Datanode -->
> <!--          -->
>
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>/data01/hadoop-hdfs/dfs/data,/data02/hadoop-hdfs/dfs/data,/data03/hadoop-hdfs/dfs/data,/data04/hadoop-hdfs/dfs/data</value>
> </property>
>
> <property>
>   <name>dfs.datanode.failed.volumes.tolerated</name>
>   <value>0</value>
> </property>
>
> <property>
>   <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
>   <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
> </property>
>
> <property>
>   <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
>   <value>107374182400</value>
> </property>
>
> <property>
>   <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
>   <value>0.75</value>
> </property>
>
> <!--                   -->
> <!-- DataNode security -->
> <!--                   -->
>
> <property>
>   <name>dfs.datanode.data.dir.perm</name>
>   <value>700</value>
> </property>
>
> <property>
>   <name>dfs.datanode.keytab.file</name>
>   <value>/etc/krb5/hdfs.keytab</value>
> </property>
>
> <property>
>   <name>dfs.datanode.kerberos.principal</name>
>   <value>hdfs/[email protected]</value>
> </property>
>
> <property>
>   <name>dfs.datanode.address</name>
>   <value>0.0.0.0:1004</value>
> </property>
>
> <property>
>   <name>dfs.datanode.http.address</name>
>   <value>0.0.0.0:1006</value>
> </property>
>
> <!--      -->
> <!-- Misc -->
> <!--      -->
>
> <property>
>   <name>dfs.replication</name>
>   <value>3</value>
> </property>
>
> <property>
>   <name>dfs.permissions.superusergroup</name>
>   <value>hadoop</value>
> </property>
>
> <property>
>   <name>dfs.webhdfs.enabled</name>
>   <value>true</value>
> </property>
>
> <property>
>   <name>dfs.hosts.exclude</name>
>   <value>/etc/hadoop/conf/hosts.exclude</value>
>   <final>true</final>
> </property>
>
> <!--
>   From O'Reilly Hadoop Operations: A general guideline for setting
>   dfs.namenode.handler.count is to make it the natural logarithm of
>   the number of cluster nodes times 20 (as a whole number). python -c
>   'import math ; print int(math.log(num_of_nodes) * 20)'
> -->
> <property>
>   <name>dfs.namenode.handler.count</name>
>   <value>24</value>
> </property>
>
> <!--              -->
> <!-- Web security -->
> <!--              -->
>
> <property>
>   <name>dfs.web.authentication.kerberos.keytab</name>
>   <value>/etc/krb5/hdfs.keytab</value>
> </property>
>
> <property>
>   <name>dfs.web.authentication.kerberos.principal</name>
>   <value>HTTP/[email protected]</value>
> </property>
>
> <property>
>   <name>dfs.http.policy</name>
>   <value>HTTP_ONLY</value>
> </property>
>
> </configuration>
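Per the error messages above, the principal values in this hdfs-site.xml pin a concrete host rather than using the _HOST placeholder. A quick, hypothetical lint (the helper name and the trimmed sample snippet are mine, not from the thread) that flags such values:

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative stand-in for the posted hdfs-site.xml.
HDFS_SITE = """<configuration>
  <property>
    <name>dfs.namenode.kerberos.principal</name>
    <value>hdfs/aw1hdnn001.tnbsound.com@TNBSOUND.COM</value>
  </property>
  <property>
    <name>dfs.journalnode.kerberos.principal</name>
    <value>hdfs/_HOST@TNBSOUND.COM</value>
  </property>
</configuration>"""

def literal_host_principals(xml_text: str) -> list[str]:
    """Return the names of *.kerberos.principal properties whose value
    pins a literal host instead of the _HOST placeholder."""
    root = ET.fromstring(xml_text)
    flagged = []
    for prop in root.findall("property"):
        name = prop.findtext("name", "")
        value = prop.findtext("value", "")
        if name.endswith("kerberos.principal") and "/" in value:
            host = value.split("/", 1)[1].split("@", 1)[0]
            if host != "_HOST":
                flagged.append(name)
    return flagged

print(literal_host_principals(HDFS_SITE))
# -> ['dfs.namenode.kerberos.principal']
```

Switching the flagged values to the _HOST form also means the same config file can be shipped to every node unchanged.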
