Wang-Benjamin opened a new pull request, #8009:
URL: https://github.com/apache/hadoop/pull/8009
## Problem
There may be a race condition in namenode ID generation. The underlying
problem is that different URIs can resolve to the same namenode ID, which can
lead to data integrity and security problems. The function meant to detect
this case is also incorrect: it throws an error whenever the namenode ID
already exists, even if the URI is the same. As a result,
`testViewFsMultipleExportPoint()` behaves nondeterministically when the test
execution order and timing change.
Command line used to identify the nondeterminism of
`testViewFsMultipleExportPoint()`:
```
mvn -pl hadoop-hdfs-project/hadoop-hdfs-nfs \
-Dcheckstyle.skip=true -Drat.skip=true \
edu.illinois:nondex-maven-plugin:2.1.7:nondex \
-Dtest=org.apache.hadoop.hdfs.nfs.nfs3.TestExportsTable#testViewFsMultipleExportPoint \
-DnondexRuns=20 |& tee nondex-$(date +%s).log
```
Error example:
```
[ERROR] TestExportsTable.testViewFsMultipleExportPoint:111 » FileSystem
FS:viewfs, Namenode ID collision for path:/hdfs2 nnid:2130740544 uri being
added:hdfs://localhost:34111/ existing uri:hdfs://localhost:34111/
```
## Patch
The proposed fix changes the collision detection logic in
`prepareAddressMap()` so that it throws an error only when different URIs
resolve to the same namenode ID.
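
The corrected check could look roughly like the sketch below. It is not the code from this patch: the class `NamenodeIdRegistry` and its `register()` method are hypothetical names used only to illustrate the rule that re-adding the same URI is harmless while a different URI with the same ID is an error.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: illustrates the intended check only.
final class NamenodeIdRegistry {
    // Maps each namenode ID to the URI that first claimed it.
    private final Map<Integer, URI> idToUri = new HashMap<>();

    /**
     * Records the ID for a URI. Registering the same URI again is a no-op;
     * a different URI that maps to an already used ID is reported as a collision.
     */
    void register(int namenodeId, URI uri) {
        URI existing = idToUri.putIfAbsent(namenodeId, uri);
        if (existing != null && !existing.equals(uri)) {
            throw new IllegalStateException("Namenode ID collision nnid:" + namenodeId
                + " uri being added:" + uri + " existing uri:" + existing);
        }
    }

    int size() {
        return idToUri.size();
    }
}
```

With this rule, the situation in the error example above (the same URI `hdfs://localhost:34111/` added twice under one ID) would no longer be treated as a collision.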
In addition, the namenode ID generation logic in `getNamenodeId()` now uses a
more robust hash function that combines the URI and address strings, which
reduces the chance of a hash collision.
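
One way to combine two strings into a single ID is sketched below. The hash function used by this patch is not stated here, so the choice of SHA-256, the truncation to 32 bits, and the names `NamenodeIdSketch` and `namenodeId()` are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the patch's implementation.
final class NamenodeIdSketch {
    /** Derives a 32-bit ID from the URI and address strings (assumed inputs). */
    static int namenodeId(String uri, String address) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(uri.getBytes(StandardCharsets.UTF_8));
            // Separator byte so that ("ab", "c") and ("a", "bc") feed different input.
            md.update((byte) 0);
            md.update(address.getBytes(StandardCharsets.UTF_8));
            // Keep the first four digest bytes as the integer ID.
            return ByteBuffer.wrap(md.digest()).getInt();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }
}
```

Any 32-bit ID can still collide, so a hash like this lowers the probability of a collision but does not replace the collision check in `prepareAddressMap()`.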
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]