This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 564da31da1b [Kerberos]Add Kerberos FAQ (#3282)
564da31da1b is described below
commit 564da31da1b0af4a174b5398fef440fae7197088
Author: Calvin Kirs <[email protected]>
AuthorDate: Wed Jan 14 15:36:31 2026 +0800
[Kerberos]Add Kerberos FAQ (#3282)
---
docs/lakehouse/best-practices/kerberos.md | 74 +++++++++++++++++++++
docs/lakehouse/metastores/hive-metastore.md | 2 +-
docs/lakehouse/storages/hdfs.md | 1 +
.../current/lakehouse/best-practices/kerberos.md | 74 +++++++++++++++++++++
.../current/lakehouse/metastores/hive-metastore.md | 2 +-
.../current/lakehouse/storages/hdfs.md | 2 +
.../lakehouse/best-practices/kerberos.md | 74 +++++++++++++++++++++
.../lakehouse/metastores/hive-metastore.md | 2 +-
.../version-3.x/lakehouse/storages/hdfs.md | 2 +
.../lakehouse/best-practices/kerberos.md | 74 +++++++++++++++++++++
.../lakehouse/metastores/hive-metastore.md | 2 +-
.../version-4.x/lakehouse/storages/hdfs.md | 2 +
.../lakehouse/best-practices/kerberos.md | 75 ++++++++++++++++++++++
.../lakehouse/metastores/hive-metastore.md | 2 +-
.../version-3.x/lakehouse/storages/hdfs.md | 1 +
.../lakehouse/best-practices/kerberos.md | 75 ++++++++++++++++++++++
.../lakehouse/metastores/hive-metastore.md | 2 +-
.../version-4.x/lakehouse/storages/hdfs.md | 1 +
18 files changed, 461 insertions(+), 6 deletions(-)
diff --git a/docs/lakehouse/best-practices/kerberos.md
b/docs/lakehouse/best-practices/kerberos.md
index 6e2063e10ef..093571befd5 100644
--- a/docs/lakehouse/best-practices/kerberos.md
+++ b/docs/lakehouse/best-practices/kerberos.md
@@ -263,3 +263,77 @@ Or directly use 127.0.0.1 (provided that the service has
been mapped to the host
```
At this point, the multi-Kerberos cluster access configuration is complete.
You can view data from both Hive clusters and use different Kerberos
credentials.
+## FAQ
+1. javax.security.sasl.SaslException: No common protection layer between
client and server
+ - Cause: The client's `hadoop.rpc.protection` differs from the HDFS cluster
setting.
+ - Fix: Align `hadoop.rpc.protection` between the client and HDFS server.
+
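As a quick way to compare the two sides, one can dump the effective value from
each cluster's copy of `core-site.xml`. A minimal sketch (the file path and the
`privacy` value below are illustrative, not Doris defaults):

```shell
# Sketch: extract hadoop.rpc.protection from a core-site.xml copy.
# The path and value here are examples for illustration only.
cat > /tmp/client-core-site.xml <<'EOF'
<configuration>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
</configuration>
EOF
# Print the configured value; run the same check against the server's copy
# and make sure the two values match.
grep -A1 'hadoop.rpc.protection' /tmp/client-core-site.xml | grep -o '<value>[^<]*</value>'
```
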
+2. No valid credentials provided (Mechanism level: Illegal key size)
+ - Cause: Java by default does not support encryption keys larger than 128
bits.
+ - Fix: Install the Java Cryptography Extension (JCE) Unlimited Strength
Policy; unpack the JARs into `$JAVA_HOME/jre/lib/security` and restart services.
+
+3. Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
+ - Cause: The current Java environment lacks AES256 support while Kerberos
may use it.
+ - Fix: Update `/etc/krb5.conf` in `[libdefaults]` to use a supported
cipher, or install the JCE extension to enable AES256 (same as above).
+
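For example, to pin Kerberos to AES128 until the JCE policy is installed, the
`[libdefaults]` section of `/etc/krb5.conf` could look like the following
(these enctype names are one possible choice, not a recommendation for every
environment):

```shell
[libdefaults]
    default_tkt_enctypes = aes128-cts-hmac-sha1-96
    default_tgs_enctypes = aes128-cts-hmac-sha1-96
    permitted_enctypes   = aes128-cts-hmac-sha1-96
```
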
+4. No valid credentials provided (Mechanism level: Failed to find any Kerberos
tgt)
+ - Cause: Kerberos cannot find a valid Ticket Granting Ticket (TGT). In a
previously working setup, the ticket expired or the KDC restarted. In a new
setup, `krb5.conf` or the keytab is incorrect or corrupted.
+ - Fix: Verify `krb5.conf` and the keytab, ensure tickets are valid, and try
`kinit` to obtain a new ticket.
+
+5. Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
+ - Cause: GSS-API checksum failure; wrong password used with `kinit`; keytab
is invalid or has an outdated key version so JVM falls back to password login.
+ - Fix: Use the correct password with `kinit` and ensure the keytab is
current and valid.
+
+6. Receive timed out
+ - Cause: Using UDP to talk to the KDC on unstable networks or with large
packets.
+ - Fix: Force Kerberos to use TCP by adding in `/etc/krb5.conf`:
+```shell
+[libdefaults]
+udp_preference_limit = 1
+```
+
+7. javax.security.auth.login.LoginException: Unable to obtain password from
user
+ - Cause: Principal does not match the keytab, or the application cannot
read `krb5.conf` or the keytab.
+ - Fix:
+ - Use `klist -kt <keytab_file>` and `kinit -kt <keytab_file>
<principal>` to validate the keytab and principal.
+ - Check paths and permissions for `krb5.conf` and the keytab so the
runtime user can read them.
+ - Ensure JVM startup options specify the correct config paths.
+
+8. Principal not found or Could not resolve Kerberos principal name
+ - Causes:
+ - The hostname in the principal cannot be resolved.
+ - The `_HOST` placeholder expands to a hostname unknown to the KDC.
+ - DNS or `/etc/hosts` is misconfigured.
+ - Fix:
+ - Verify the principal spelling.
+ - Ensure all relevant nodes (Doris FE/BE and KDC) have correct
hostname-to-IP entries.
+
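As an illustration, consistent `/etc/hosts` entries on every node might look
like this (all IPs and hostnames below are made up):

```shell
10.0.0.11  doris-fe-01.example.com  doris-fe-01
10.0.0.21  doris-be-01.example.com  doris-be-01
10.0.0.31  kdc.example.com          kdc
```
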
+9. Cannot find KDC for realm "XXX"
+ - Cause: The specified realm has no KDC configured in `krb5.conf`.
+ - Fix:
+ - Check the realm name under `[realms]`.
+ - Confirm the `kdc` address.
+ - Restart BE & FE after changing `/etc/krb5.conf`.
+
+10. Request is a replay
+ - Cause: KDC thinks the auth request is duplicated. Typical reasons: clock
skew across nodes or multiple services sharing the same principal.
+ - Fix:
+ - Enable NTP on all nodes to keep time in sync.
+ - Use unique principals per service instance, such as
`service/_HOST@REALM`, to avoid sharing.
+
+11. Client not found in Kerberos database
+ - Cause: The client principal does not exist in the Kerberos database.
+ - Fix: Create the principal in the KDC.
+
+12. Message stream modified (41)
+ - Cause: Known issue for certain OS (e.g., CentOS 7) with Kerberos/Java
combinations.
+ - Fix: Apply vendor patches or security updates.
+
+13. Pre-authentication information was invalid (24)
+ - Causes:
+ - Invalid pre-auth data.
+ - Clock skew between client and KDC.
+ - JDK cipher configuration mismatches the KDC.
+ - Fix:
+ - Sync time across all nodes.
+ - Align cipher configurations.
\ No newline at end of file
diff --git a/docs/lakehouse/metastores/hive-metastore.md
b/docs/lakehouse/metastores/hive-metastore.md
index c9a6cb14bc8..5519bc87409 100644
--- a/docs/lakehouse/metastores/hive-metastore.md
+++ b/docs/lakehouse/metastores/hive-metastore.md
@@ -80,7 +80,7 @@ To use Kerberos authentication to connect to Hive Metastore
service, configure t
When using Hive MetaStore service with Kerberos authentication enabled, ensure
that the same keytab file exists on all FE nodes, the user running the Doris
process has read permission to the keytab file, and the krb5 configuration file
is properly configured.
-For detailed Kerberos configuration, refer to Kerberos Authentication.
+For information on common Kerberos configuration issues and best practices,
refer to [Kerberos Best Practices](../best-practices/kerberos.md).
### Configuration File Parameters
diff --git a/docs/lakehouse/storages/hdfs.md b/docs/lakehouse/storages/hdfs.md
index 5c6444afa44..01b401f01a1 100644
--- a/docs/lakehouse/storages/hdfs.md
+++ b/docs/lakehouse/storages/hdfs.md
@@ -93,6 +93,7 @@ Example:
"hdfs.authentication.kerberos.principal" = "hdfs/[email protected]",
"hdfs.authentication.kerberos.keytab" = "/etc/security/keytabs/hdfs.keytab",
```
+For troubleshooting common Kerberos configuration issues, see the [Kerberos
FAQ](../best-practices/kerberos.md#faq).
## HDFS HA Configuration
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/kerberos.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/kerberos.md
index 2b0e21b212c..e0406280722 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/kerberos.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/best-practices/kerberos.md
@@ -263,3 +263,77 @@ docker inspect <container-name> | grep IPAddress
```
至此,完成多 Kerberos 集群访问配置,您可以查看两个 Hive 集群中的数据,并使用不同的 Kerberos 凭证。
+
+## FAQ
+1. javax.security.sasl.SaslException: No common protection layer between
client and server
+ - 原因: 客户端的 hadoop.rpc.protection 配置与 HDFS 集群上的配置不一致。
+ - 解决: 检查并统一客户端与 HDFS Server 的 hadoop.rpc.protection 配置。
+
+2. No valid credentials provided (Mechanism level: Illegal key size)
+ - 原因: Java 默认不支持大于 128 位的加密密钥长度。
+ - 解决: 下载并安装 Java Cryptography Extension (JCE) Unlimited Strength
Jurisdiction Policy Files。将下载的 JAR 文件解压并放置到 $JAVA_HOME/jre/lib/security
目录下,然后重启相关服务。
+
+3. Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
+ - 原因: 当前 Java 环境不支持 AES256 加密,而 Kerberos 默认可能使用此加密方式。
+ - 解决:修改 Kerberos 配置文件 (/etc/krb5.conf),在 [libdefaults] 部分指定一个当前环境支持的加密算法。
或者,安装 JCE 扩展包以启用对 AES256 的支持(同上一个问题)。
+
+4. No valid credentials provided (Mechanism level: Failed to find any Kerberos
tgt)
+ - 原因: Kerberos 无法找到有效的票据授权票据 (Ticket Granting Ticket, TGT)。 对于已正常运行过的环境:
票据(Ticket)已过期或 Kerberos 服务端(KDC)已重启。 对于首次配置的环境: krb5.conf 配置文件有误,或 keytab
文件不正确/已损坏。
+ - 解决: 检查 krb5.conf 和 keytab 文件的正确性,并确保票据在有效期内。可以尝试使用 kinit 命令重新获取票据。
+
+5. Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
+ - 原因: GSS-API 校验和失败。 kinit 时使用的密码错误。 keytab 文件无效或包含过期的密钥版本,导致 JVM
回退到尝试使用用户密码登录。
+ - 解决: 确认 kinit 使用的密码正确,并检查 keytab 文件是否为最新且有效。
+
+6. Receive timed out
+ - 原因: 使用 UDP 协议与 KDC 通信时,网络不稳定或数据包较大,容易导致超时。
+ - 解决: 强制 Kerberos 使用 TCP 协议。在 /etc/krb5.conf 的 [libdefaults] 部分添加以下配置:
+```shell
+[libdefaults]
+udp_preference_limit = 1
+```
+7. javax.security.auth.login.LoginException: Unable to obtain password from
user
+ - 原因: Principal 和 keytab 文件不匹配,或者应用程序无法读取 krb5.conf 或 keytab 文件。
+ - 解决:
+ - 使用 klist -kt <keytab_file> 和 kinit -kt <keytab_file> <principal> 命令验证
keytab 和 principal 是否匹配。
+ - 检查 krb5.conf 和 keytab 文件的路径和文件权限,确保运行程序的用户有权读取它们。
+ - 确认 JVM 启动参数中是否正确指定了配置文件路径。
+
+8. Principal not found 或 Could not resolve Kerberos principal name
+ - 原因:
+ - Principal 名称中的主机名无法被正确解析。
+ - Principal 格式 user/_HOST@REALM 中的 _HOST 占位符被替换为了一个 KDC 无法识别的主机名。
+ - DNS 或 /etc/hosts 文件配置不正确,导致主机名解析失败。
+ - 解决:
+ - 检查 Principal 名称的拼写是否正确。
+ - 确保在所有相关节点(包括 Doris FE、BE 和 KDC)的 /etc/hosts 文件中都包含了正确的主机名和 IP 地址映射。
+
+9. Cannot find KDC for realm "XXX"
+ - 原因: 在 krb5.conf 文件中找不到指定 Realm 的 KDC 配置。
+ - 解决:
+ - 检查 krb5.conf 文件中 [realms] 部分的 Realm 名称是否拼写正确。
+ - 确认该 Realm 下的 kdc 地址是否配置正确。
+ - 如果修改或新增了 /etc/krb5.conf,需要重启 BE&FE 才能使配置生效。
+
+10. Request is a replay
+ - 原因: KDC 认为收到了一个重复的认证请求,这可能是攻击行为。 时间不同步: 集群中各节点(包括 KDC)的时钟不一致。 Principal
共享: 多个服务或进程共享了同一个 Principal(例如 service@REALM),导致认证请求冲突。
+ - 解决:
+ - 在所有节点上配置并启用 NTP 服务,确保时间同步。
+ - 为每个服务实例使用特定的 Principal,格式为 service/_HOST@REALM,避免共享。
+
+11. Client not found in Kerberos database
+ - 原因: 客户端 Principal 在 Kerberos 数据库中不存在。
+ - 解决: 确认使用的 Principal 是否已在 KDC 中正确创建。
+
+12. Message stream modified (41)
+ - 原因: 这通常是特定操作系统(如 CentOS 7)与 Kerberos/Java 组合下的已知问题。
+ - 解决: 联系操作系统供应商或查找相关的安全补丁。
+
+13. Pre-authentication information was invalid (24)
+ - 原因:
+ - 预认证信息无效。
+ - 客户端和 KDC 之间的时钟不同步。
+ - 客户端 JDK 的加密算法与 KDC 不匹配。
+ - 解决:
+ - 检查并同步所有节点的时间。
+ - 检查并统一加密算法配置。
\ No newline at end of file
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/metastores/hive-metastore.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/metastores/hive-metastore.md
index 55b7931c378..824e7d7b867 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/metastores/hive-metastore.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/metastores/hive-metastore.md
@@ -80,7 +80,7 @@
使用开启 Kerberos 认证的 Hive MetaStore 服务时需要确保所有 FE 节点上都存在相同的 keytab 文件,并且运行 Doris
进程的用户具有该 keytab 文件的读权限。以及 krb5 配置文件配置正确。
-Kerberos 详细配置参考 Kerberos 认证。
+Kerberos 配置常见问题及最佳实践请参考 [Kerberos](../best-practices/kerberos.md)。
### 配置文件参数
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/hdfs.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/hdfs.md
index eb561af6609..f5098c1f1bb 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/hdfs.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/storages/hdfs.md
@@ -94,6 +94,8 @@ Doris 将以该 `hdfs.authentication.kerberos.principal` 属性指定的主体
"hdfs.authentication.kerberos.keytab" = "/etc/security/keytabs/hdfs.keytab",
```
+Kerberos 配置常见问题请参考 [Kerberos FAQ](../best-practices/kerberos.md#faq)。
+
## 高可用配置(HDFS HA)
如 HDFS 开启了 HA 模式,需要配置 `dfs.nameservices` 相关参数:
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/kerberos.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/kerberos.md
index 2b0e21b212c..106230cd048 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/kerberos.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/best-practices/kerberos.md
@@ -263,3 +263,77 @@ docker inspect <container-name> | grep IPAddress
```
至此,完成多 Kerberos 集群访问配置,您可以查看两个 Hive 集群中的数据,并使用不同的 Kerberos 凭证。
+
+## FAQ
+1. javax.security.sasl.SaslException: No common protection layer between
client and server
+ - 原因: 客户端的 hadoop.rpc.protection 配置与 HDFS 集群上的配置不一致。
+ - 解决: 检查并统一客户端与 HDFS Server 的 hadoop.rpc.protection 配置。
+
+2. No valid credentials provided (Mechanism level: Illegal key size)
+ - 原因: Java 默认不支持大于 128 位的加密密钥长度。
+ - 解决: 下载并安装 Java Cryptography Extension (JCE) Unlimited Strength
Jurisdiction Policy Files。将下载的 JAR 文件解压并放置到 $JAVA_HOME/jre/lib/security
目录下,然后重启相关服务。
+
+3. Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
+ - 原因: 当前 Java 环境不支持 AES256 加密,而 Kerberos 默认可能使用此加密方式。
+ - 解决:修改 Kerberos 配置文件 (/etc/krb5.conf),在 [libdefaults] 部分指定一个当前环境支持的加密算法。
或者,安装 JCE 扩展包以启用对 AES256 的支持(同上一个问题)。
+
+4. No valid credentials provided (Mechanism level: Failed to find any Kerberos
tgt)
+ - 原因: Kerberos 无法找到有效的票据授权票据 (Ticket Granting Ticket, TGT)。 对于已正常运行过的环境:
票据(Ticket)已过期或 Kerberos 服务端(KDC)已重启。 对于首次配置的环境: krb5.conf 配置文件有误,或 keytab
文件不正确/已损坏。
+ - 解决: 检查 krb5.conf 和 keytab 文件的正确性,并确保票据在有效期内。可以尝试使用 kinit 命令重新获取票据。
+
+5. Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
+ - 原因: GSS-API 校验和失败。 kinit 时使用的密码错误。 keytab 文件无效或包含过期的密钥版本,导致 JVM
回退到尝试使用用户密码登录。
+ - 解决: 确认 kinit 使用的密码正确,并检查 keytab 文件是否为最新且有效。
+
+6. Receive timed out
+ - 原因: 使用 UDP 协议与 KDC 通信时,网络不稳定或数据包较大,容易导致超时。
+ - 解决: 强制 Kerberos 使用 TCP 协议。在 /etc/krb5.conf 的 [libdefaults] 部分添加以下配置:
+```shell
+[libdefaults]
+udp_preference_limit = 1
+```
+7. javax.security.auth.login.LoginException: Unable to obtain password from
user
+ - 原因: Principal 和 keytab 文件不匹配,或者应用程序无法读取 krb5.conf 或 keytab 文件。
+ - 解决:
+ - 使用 klist -kt <keytab_file> 和 kinit -kt <keytab_file> <principal> 命令验证
keytab 和 principal 是否匹配。
+ - 检查 krb5.conf 和 keytab 文件的路径和文件权限,确保运行程序的用户有权读取它们。
+ - 确认 JVM 启动参数中是否正确指定了配置文件路径。
+
+8. Principal not found 或 Could not resolve Kerberos principal name
+ - 原因:
+ - Principal 名称中的主机名无法被正确解析。
+ - Principal 格式 user/_HOST@REALM 中的 _HOST 占位符被替换为了一个 KDC 无法识别的主机名。
+ - DNS 或 /etc/hosts 文件配置不正确,导致主机名解析失败。
+ - 解决:
+ - 检查 Principal 名称的拼写是否正确。
+ - 确保在所有相关节点(包括 Doris FE、BE 和 KDC)的 /etc/hosts 文件中都包含了正确的主机名和 IP 地址映射。
+
+9. Cannot find KDC for realm "XXX"
+ - 原因: 在 krb5.conf 文件中找不到指定 Realm 的 KDC 配置。
+ - 解决:
+ - 检查 krb5.conf 文件中 [realms] 部分的 Realm 名称是否拼写正确。
+ - 确认该 Realm 下的 kdc 地址是否配置正确。
+ - 如果修改或新增了 /etc/krb5.conf,需要重启 BE&FE 才能使配置生效。
+
+10. Request is a replay
+ - 原因: KDC 认为收到了一个重复的认证请求,这可能是攻击行为。 时间不同步: 集群中各节点(包括 KDC)的时钟不一致。 Principal
共享: 多个服务或进程共享了同一个 Principal(例如 service@REALM),导致认证请求冲突。
+ - 解决:
+ - 在所有节点上配置并启用 NTP 服务,确保时间同步。
+ - 为每个服务实例使用特定的 Principal,格式为 service/_HOST@REALM,避免共享。
+
+11. Client not found in Kerberos database
+ - 原因: 客户端 Principal 在 Kerberos 数据库中不存在。
+ - 解决: 确认使用的 Principal 是否已在 KDC 中正确创建。
+
+12. Message stream modified (41)
+ - 原因: 这通常是特定操作系统(如 CentOS 7)与 Kerberos/Java 组合下的已知问题。
+ - 解决: 联系操作系统供应商或查找相关的安全补丁。
+
+13. Pre-authentication information was invalid (24)
+ - 原因:
+ - 预认证信息无效。
+ - 客户端和 KDC 之间的时钟不同步。
+ - 客户端 JDK 的加密算法与 KDC 不匹配。
+ - 解决:
+ - 检查并同步所有节点的时间。
+ - 检查并统一加密算法配置。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/metastores/hive-metastore.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/metastores/hive-metastore.md
index 55b7931c378..824e7d7b867 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/metastores/hive-metastore.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/metastores/hive-metastore.md
@@ -80,7 +80,7 @@
使用开启 Kerberos 认证的 Hive MetaStore 服务时需要确保所有 FE 节点上都存在相同的 keytab 文件,并且运行 Doris
进程的用户具有该 keytab 文件的读权限。以及 krb5 配置文件配置正确。
-Kerberos 详细配置参考 Kerberos 认证。
+Kerberos 配置常见问题及最佳实践请参考 [Kerberos](../best-practices/kerberos.md)。
### 配置文件参数
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/hdfs.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/hdfs.md
index 932aeeb06d6..23c46bad271 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/hdfs.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/storages/hdfs.md
@@ -94,6 +94,8 @@ Doris 将以该 `hdfs.authentication.kerberos.principal` 属性指定的主体
"hdfs.authentication.kerberos.keytab" = "/etc/security/keytabs/hdfs.keytab",
```
+Kerberos 配置常见问题请参考 [Kerberos FAQ](../best-practices/kerberos.md#faq)。
+
## 高可用配置(HDFS HA)
如 HDFS 开启了 HA 模式,需要配置 `dfs.nameservices` 相关参数:
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/kerberos.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/kerberos.md
index 2b0e21b212c..106230cd048 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/kerberos.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/best-practices/kerberos.md
@@ -263,3 +263,77 @@ docker inspect <container-name> | grep IPAddress
```
至此,完成多 Kerberos 集群访问配置,您可以查看两个 Hive 集群中的数据,并使用不同的 Kerberos 凭证。
+
+## FAQ
+1. javax.security.sasl.SaslException: No common protection layer between
client and server
+ - 原因: 客户端的 hadoop.rpc.protection 配置与 HDFS 集群上的配置不一致。
+ - 解决: 检查并统一客户端与 HDFS Server 的 hadoop.rpc.protection 配置。
+
+2. No valid credentials provided (Mechanism level: Illegal key size)
+ - 原因: Java 默认不支持大于 128 位的加密密钥长度。
+ - 解决: 下载并安装 Java Cryptography Extension (JCE) Unlimited Strength
Jurisdiction Policy Files。将下载的 JAR 文件解压并放置到 $JAVA_HOME/jre/lib/security
目录下,然后重启相关服务。
+
+3. Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
+ - 原因: 当前 Java 环境不支持 AES256 加密,而 Kerberos 默认可能使用此加密方式。
+ - 解决:修改 Kerberos 配置文件 (/etc/krb5.conf),在 [libdefaults] 部分指定一个当前环境支持的加密算法。
或者,安装 JCE 扩展包以启用对 AES256 的支持(同上一个问题)。
+
+4. No valid credentials provided (Mechanism level: Failed to find any Kerberos
tgt)
+ - 原因: Kerberos 无法找到有效的票据授权票据 (Ticket Granting Ticket, TGT)。 对于已正常运行过的环境:
票据(Ticket)已过期或 Kerberos 服务端(KDC)已重启。 对于首次配置的环境: krb5.conf 配置文件有误,或 keytab
文件不正确/已损坏。
+ - 解决: 检查 krb5.conf 和 keytab 文件的正确性,并确保票据在有效期内。可以尝试使用 kinit 命令重新获取票据。
+
+5. Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
+ - 原因: GSS-API 校验和失败。 kinit 时使用的密码错误。 keytab 文件无效或包含过期的密钥版本,导致 JVM
回退到尝试使用用户密码登录。
+ - 解决: 确认 kinit 使用的密码正确,并检查 keytab 文件是否为最新且有效。
+
+6. Receive timed out
+ - 原因: 使用 UDP 协议与 KDC 通信时,网络不稳定或数据包较大,容易导致超时。
+ - 解决: 强制 Kerberos 使用 TCP 协议。在 /etc/krb5.conf 的 [libdefaults] 部分添加以下配置:
+```shell
+[libdefaults]
+udp_preference_limit = 1
+```
+7. javax.security.auth.login.LoginException: Unable to obtain password from
user
+ - 原因: Principal 和 keytab 文件不匹配,或者应用程序无法读取 krb5.conf 或 keytab 文件。
+ - 解决:
+ - 使用 klist -kt <keytab_file> 和 kinit -kt <keytab_file> <principal> 命令验证
keytab 和 principal 是否匹配。
+ - 检查 krb5.conf 和 keytab 文件的路径和文件权限,确保运行程序的用户有权读取它们。
+ - 确认 JVM 启动参数中是否正确指定了配置文件路径。
+
+8. Principal not found 或 Could not resolve Kerberos principal name
+ - 原因:
+ - Principal 名称中的主机名无法被正确解析。
+ - Principal 格式 user/_HOST@REALM 中的 _HOST 占位符被替换为了一个 KDC 无法识别的主机名。
+ - DNS 或 /etc/hosts 文件配置不正确,导致主机名解析失败。
+ - 解决:
+ - 检查 Principal 名称的拼写是否正确。
+ - 确保在所有相关节点(包括 Doris FE、BE 和 KDC)的 /etc/hosts 文件中都包含了正确的主机名和 IP 地址映射。
+
+9. Cannot find KDC for realm "XXX"
+ - 原因: 在 krb5.conf 文件中找不到指定 Realm 的 KDC 配置。
+ - 解决:
+ - 检查 krb5.conf 文件中 [realms] 部分的 Realm 名称是否拼写正确。
+ - 确认该 Realm 下的 kdc 地址是否配置正确。
+ - 如果修改或新增了 /etc/krb5.conf,需要重启 BE&FE 才能使配置生效。
+
+10. Request is a replay
+ - 原因: KDC 认为收到了一个重复的认证请求,这可能是攻击行为。 时间不同步: 集群中各节点(包括 KDC)的时钟不一致。 Principal
共享: 多个服务或进程共享了同一个 Principal(例如 service@REALM),导致认证请求冲突。
+ - 解决:
+ - 在所有节点上配置并启用 NTP 服务,确保时间同步。
+ - 为每个服务实例使用特定的 Principal,格式为 service/_HOST@REALM,避免共享。
+
+11. Client not found in Kerberos database
+ - 原因: 客户端 Principal 在 Kerberos 数据库中不存在。
+ - 解决: 确认使用的 Principal 是否已在 KDC 中正确创建。
+
+12. Message stream modified (41)
+ - 原因: 这通常是特定操作系统(如 CentOS 7)与 Kerberos/Java 组合下的已知问题。
+ - 解决: 联系操作系统供应商或查找相关的安全补丁。
+
+13. Pre-authentication information was invalid (24)
+ - 原因:
+ - 预认证信息无效。
+ - 客户端和 KDC 之间的时钟不同步。
+ - 客户端 JDK 的加密算法与 KDC 不匹配。
+ - 解决:
+ - 检查并同步所有节点的时间。
+ - 检查并统一加密算法配置。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/metastores/hive-metastore.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/metastores/hive-metastore.md
index 55b7931c378..824e7d7b867 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/metastores/hive-metastore.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/metastores/hive-metastore.md
@@ -80,7 +80,7 @@
使用开启 Kerberos 认证的 Hive MetaStore 服务时需要确保所有 FE 节点上都存在相同的 keytab 文件,并且运行 Doris
进程的用户具有该 keytab 文件的读权限。以及 krb5 配置文件配置正确。
-Kerberos 详细配置参考 Kerberos 认证。
+Kerberos 配置常见问题及最佳实践请参考 [Kerberos](../best-practices/kerberos.md)。
### 配置文件参数
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/hdfs.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/hdfs.md
index 932aeeb06d6..23c46bad271 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/hdfs.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/storages/hdfs.md
@@ -94,6 +94,8 @@ Doris 将以该 `hdfs.authentication.kerberos.principal` 属性指定的主体
"hdfs.authentication.kerberos.keytab" = "/etc/security/keytabs/hdfs.keytab",
```
+Kerberos 配置常见问题请参考 [Kerberos FAQ](../best-practices/kerberos.md#faq)。
+
## 高可用配置(HDFS HA)
如 HDFS 开启了 HA 模式,需要配置 `dfs.nameservices` 相关参数:
diff --git a/versioned_docs/version-3.x/lakehouse/best-practices/kerberos.md
b/versioned_docs/version-3.x/lakehouse/best-practices/kerberos.md
index 6e2063e10ef..65bebefd859 100644
--- a/versioned_docs/version-3.x/lakehouse/best-practices/kerberos.md
+++ b/versioned_docs/version-3.x/lakehouse/best-practices/kerberos.md
@@ -263,3 +263,78 @@ Or directly use 127.0.0.1 (provided that the service has
been mapped to the host
```
At this point, the multi-Kerberos cluster access configuration is complete.
You can view data from both Hive clusters and use different Kerberos
credentials.
+
+## FAQ
+1. javax.security.sasl.SaslException: No common protection layer between
client and server
+ - Cause: The client's `hadoop.rpc.protection` differs from the HDFS cluster
setting.
+ - Fix: Align `hadoop.rpc.protection` between the client and HDFS server.
+
+2. No valid credentials provided (Mechanism level: Illegal key size)
+ - Cause: Java by default does not support encryption keys larger than 128
bits.
+ - Fix: Install the Java Cryptography Extension (JCE) Unlimited Strength
Policy; unpack the JARs into `$JAVA_HOME/jre/lib/security` and restart services.
+
+3. Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
+ - Cause: The current Java environment lacks AES256 support while Kerberos
may use it.
+ - Fix: Update `/etc/krb5.conf` in `[libdefaults]` to use a supported
cipher, or install the JCE extension to enable AES256 (same as above).
+
+4. No valid credentials provided (Mechanism level: Failed to find any Kerberos
tgt)
+ - Cause: Kerberos cannot find a valid Ticket Granting Ticket (TGT). In a
previously working setup, the ticket expired or the KDC restarted. In a new
setup, `krb5.conf` or the keytab is incorrect or corrupted.
+ - Fix: Verify `krb5.conf` and the keytab, ensure tickets are valid, and try
`kinit` to obtain a new ticket.
+
+5. Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
+ - Cause: GSS-API checksum failure; wrong password used with `kinit`; keytab
is invalid or has an outdated key version so JVM falls back to password login.
+ - Fix: Use the correct password with `kinit` and ensure the keytab is
current and valid.
+
+6. Receive timed out
+ - Cause: Using UDP to talk to the KDC on unstable networks or with large
packets.
+ - Fix: Force Kerberos to use TCP by adding in `/etc/krb5.conf`:
+```shell
+[libdefaults]
+udp_preference_limit = 1
+```
+
+7. javax.security.auth.login.LoginException: Unable to obtain password from
user
+ - Cause: Principal does not match the keytab, or the application cannot
read `krb5.conf` or the keytab.
+ - Fix:
+ - Use `klist -kt <keytab_file>` and `kinit -kt <keytab_file>
<principal>` to validate the keytab and principal.
+ - Check paths and permissions for `krb5.conf` and the keytab so the
runtime user can read them.
+ - Ensure JVM startup options specify the correct config paths.
+
+8. Principal not found or Could not resolve Kerberos principal name
+ - Causes:
+ - The hostname in the principal cannot be resolved.
+ - The `_HOST` placeholder expands to a hostname unknown to the KDC.
+ - DNS or `/etc/hosts` is misconfigured.
+ - Fix:
+ - Verify the principal spelling.
+ - Ensure all relevant nodes (Doris FE/BE and KDC) have correct
hostname-to-IP entries.
+
+9. Cannot find KDC for realm "XXX"
+ - Cause: The specified realm has no KDC configured in `krb5.conf`.
+ - Fix:
+ - Check the realm name under `[realms]`.
+ - Confirm the `kdc` address.
+ - Restart BE & FE after changing `/etc/krb5.conf`.
+
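A matching `[realms]` entry in `/etc/krb5.conf` might look like the following
sketch (the realm name, hostnames, and ports are placeholders):

```shell
[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com:88
        admin_server = kdc.example.com:749
    }
```
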
+10. Request is a replay
+    - Cause: KDC thinks the auth request is duplicated. Typical reasons: clock
skew across nodes or multiple services sharing the same principal.
+    - Fix:
+      - Enable NTP on all nodes to keep time in sync.
+      - Use unique principals per service instance, such as
`service/_HOST@REALM`, to avoid sharing.
+
+11. Client not found in Kerberos database
+    - Cause: The client principal does not exist in the Kerberos database.
+    - Fix: Create the principal in the KDC.
+
+12. Message stream modified (41)
+    - Cause: Known issue for certain OS (e.g., CentOS 7) with Kerberos/Java
combinations.
+    - Fix: Apply vendor patches or security updates.
+
+13. Pre-authentication information was invalid (24)
+    - Causes:
+      - Invalid pre-auth data.
+      - Clock skew between client and KDC.
+      - JDK cipher configuration mismatches the KDC.
+    - Fix:
+      - Sync time across all nodes.
+      - Align cipher configurations.
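Since several of the issues above (10 and 13) come down to clock skew, a small
sanity check against krb5's default 300-second tolerance can be sketched as
follows; `kdc_time` is a stand-in here, and in practice you would compare
against the KDC's clock or an NTP server:

```shell
# Sketch: flag clock skew beyond krb5's default clockskew of 300 seconds.
# kdc_time is a placeholder; substitute the KDC's reported time.
kdc_time=$(date -u +%s)
local_time=$(date -u +%s)
skew=$((local_time - kdc_time))
skew=${skew#-}   # absolute value
if [ "$skew" -gt 300 ]; then
    echo "clock skew too large: ${skew}s"
else
    echo "clock skew OK (${skew}s)"
fi
```
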
diff --git a/versioned_docs/version-3.x/lakehouse/metastores/hive-metastore.md
b/versioned_docs/version-3.x/lakehouse/metastores/hive-metastore.md
index c9a6cb14bc8..5519bc87409 100644
--- a/versioned_docs/version-3.x/lakehouse/metastores/hive-metastore.md
+++ b/versioned_docs/version-3.x/lakehouse/metastores/hive-metastore.md
@@ -80,7 +80,7 @@ To use Kerberos authentication to connect to Hive Metastore
service, configure t
When using Hive MetaStore service with Kerberos authentication enabled, ensure
that the same keytab file exists on all FE nodes, the user running the Doris
process has read permission to the keytab file, and the krb5 configuration file
is properly configured.
-For detailed Kerberos configuration, refer to Kerberos Authentication.
+For information on common Kerberos configuration issues and best practices,
refer to [Kerberos Best Practices](../best-practices/kerberos.md).
### Configuration File Parameters
diff --git a/versioned_docs/version-3.x/lakehouse/storages/hdfs.md
b/versioned_docs/version-3.x/lakehouse/storages/hdfs.md
index 5c6444afa44..01b401f01a1 100644
--- a/versioned_docs/version-3.x/lakehouse/storages/hdfs.md
+++ b/versioned_docs/version-3.x/lakehouse/storages/hdfs.md
@@ -93,6 +93,7 @@ Example:
"hdfs.authentication.kerberos.principal" = "hdfs/[email protected]",
"hdfs.authentication.kerberos.keytab" = "/etc/security/keytabs/hdfs.keytab",
```
+For troubleshooting common Kerberos configuration issues, see the [Kerberos
FAQ](../best-practices/kerberos.md#faq).
## HDFS HA Configuration
diff --git a/versioned_docs/version-4.x/lakehouse/best-practices/kerberos.md
b/versioned_docs/version-4.x/lakehouse/best-practices/kerberos.md
index 6e2063e10ef..65bebefd859 100644
--- a/versioned_docs/version-4.x/lakehouse/best-practices/kerberos.md
+++ b/versioned_docs/version-4.x/lakehouse/best-practices/kerberos.md
@@ -263,3 +263,78 @@ Or directly use 127.0.0.1 (provided that the service has
been mapped to the host
```
At this point, the multi-Kerberos cluster access configuration is complete.
You can view data from both Hive clusters and use different Kerberos
credentials.
+
+## FAQ
+1. javax.security.sasl.SaslException: No common protection layer between
client and server
+ - Cause: The client's `hadoop.rpc.protection` differs from the HDFS cluster
setting.
+ - Fix: Align `hadoop.rpc.protection` between the client and HDFS server.
+
+2. No valid credentials provided (Mechanism level: Illegal key size)
+ - Cause: Java by default does not support encryption keys larger than 128
bits.
+ - Fix: Install the Java Cryptography Extension (JCE) Unlimited Strength
Policy; unpack the JARs into `$JAVA_HOME/jre/lib/security` and restart services.
+
+3. Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
+ - Cause: The current Java environment lacks AES256 support while Kerberos
may use it.
+ - Fix: Update `/etc/krb5.conf` in `[libdefaults]` to use a supported
cipher, or install the JCE extension to enable AES256 (same as above).
+
+4. No valid credentials provided (Mechanism level: Failed to find any Kerberos
tgt)
+ - Cause: Kerberos cannot find a valid Ticket Granting Ticket (TGT). In a
previously working setup, the ticket expired or the KDC restarted. In a new
setup, `krb5.conf` or the keytab is incorrect or corrupted.
+ - Fix: Verify `krb5.conf` and the keytab, ensure tickets are valid, and try
`kinit` to obtain a new ticket.
+
+5. Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
+ - Cause: GSS-API checksum failure; wrong password used with `kinit`; keytab
is invalid or has an outdated key version so JVM falls back to password login.
+ - Fix: Use the correct password with `kinit` and ensure the keytab is
current and valid.
+
+6. Receive timed out
+ - Cause: Using UDP to talk to the KDC on unstable networks or with large
packets.
+ - Fix: Force Kerberos to use TCP by adding in `/etc/krb5.conf`:
+```shell
+[libdefaults]
+udp_preference_limit = 1
+```
+
+7. javax.security.auth.login.LoginException: Unable to obtain password from
user
+ - Cause: Principal does not match the keytab, or the application cannot
read `krb5.conf` or the keytab.
+ - Fix:
+ - Use `klist -kt <keytab_file>` and `kinit -kt <keytab_file>
<principal>` to validate the keytab and principal.
+ - Check paths and permissions for `krb5.conf` and the keytab so the
runtime user can read them.
+ - Ensure JVM startup options specify the correct config paths.
+
+8. Principal not found or Could not resolve Kerberos principal name
+ - Causes:
+ - The hostname in the principal cannot be resolved.
+ - The `_HOST` placeholder expands to a hostname unknown to the KDC.
+ - DNS or `/etc/hosts` is misconfigured.
+ - Fix:
+ - Verify the principal spelling.
+ - Ensure all relevant nodes (Doris FE/BE and KDC) have correct
hostname-to-IP entries.
+
+9. Cannot find KDC for realm "XXX"
+ - Cause: The specified realm has no KDC configured in `krb5.conf`.
+ - Fix:
+ - Check the realm name under `[realms]`.
+ - Confirm the `kdc` address.
+ - Restart BE & FE after changing `/etc/krb5.conf`.
+
+10. Request is a replay
+    - Cause: KDC thinks the auth request is duplicated. Typical reasons: clock
skew across nodes or multiple services sharing the same principal.
+    - Fix:
+      - Enable NTP on all nodes to keep time in sync.
+      - Use unique principals per service instance, such as
`service/_HOST@REALM`, to avoid sharing.
+
+11. Client not found in Kerberos database
+    - Cause: The client principal does not exist in the Kerberos database.
+    - Fix: Create the principal in the KDC.
+
+12. Message stream modified (41)
+    - Cause: Known issue for certain OS (e.g., CentOS 7) with Kerberos/Java
combinations.
+    - Fix: Apply vendor patches or security updates.
+
+13. Pre-authentication information was invalid (24)
+    - Causes:
+      - Invalid pre-auth data.
+      - Clock skew between client and KDC.
+      - JDK cipher configuration mismatches the KDC.
+    - Fix:
+      - Sync time across all nodes.
+      - Align cipher configurations.
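The cipher mismatch in item 13 can be reasoned about as a set intersection:
authentication needs at least one enctype common to client and KDC. A rough
shell sketch (both enctype lists below are invented for illustration):

```shell
# Sketch: find enctypes common to a client and a KDC.
# Both lists below are made-up examples.
client_enctypes="aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96"
kdc_enctypes="aes128-cts-hmac-sha1-96 des3-cbc-sha1"
common=""
for e in $client_enctypes; do
    case " $kdc_enctypes " in
        *" $e "*) common="$common $e" ;;   # e appears in both lists
    esac
done
if [ -n "$common" ]; then
    echo "common enctypes:$common"
else
    echo "no common enctypes - authentication will fail"
fi
```
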
diff --git a/versioned_docs/version-4.x/lakehouse/metastores/hive-metastore.md
b/versioned_docs/version-4.x/lakehouse/metastores/hive-metastore.md
index c9a6cb14bc8..5519bc87409 100644
--- a/versioned_docs/version-4.x/lakehouse/metastores/hive-metastore.md
+++ b/versioned_docs/version-4.x/lakehouse/metastores/hive-metastore.md
@@ -80,7 +80,7 @@ To use Kerberos authentication to connect to Hive Metastore
service, configure t
When using Hive MetaStore service with Kerberos authentication enabled, ensure
that the same keytab file exists on all FE nodes, the user running the Doris
process has read permission to the keytab file, and the krb5 configuration file
is properly configured.
-For detailed Kerberos configuration, refer to Kerberos Authentication.
+For information on common Kerberos configuration issues and best practices,
refer to [Kerberos Best Practices](../best-practices/kerberos.md).
### Configuration File Parameters
diff --git a/versioned_docs/version-4.x/lakehouse/storages/hdfs.md
b/versioned_docs/version-4.x/lakehouse/storages/hdfs.md
index 5c6444afa44..01b401f01a1 100644
--- a/versioned_docs/version-4.x/lakehouse/storages/hdfs.md
+++ b/versioned_docs/version-4.x/lakehouse/storages/hdfs.md
@@ -93,6 +93,7 @@ Example:
"hdfs.authentication.kerberos.principal" = "hdfs/[email protected]",
"hdfs.authentication.kerberos.keytab" = "/etc/security/keytabs/hdfs.keytab",
```
+For troubleshooting common Kerberos configuration issues, see the [Kerberos
FAQ](../best-practices/kerberos.md#faq).
## HDFS HA Configuration
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]