On Thu, 25 Aug 2022 14:32:06 GMT, Strahinja Stanojevic <d...@openjdk.org> wrote:

> This PR introduces an option to output stable names for the lambda classes in 
> the JDK. A stable name consists of two parts: The first part is the 
> predefined value `$$Lambda$` appended to the lambda capturing class, and the 
> second is a 64-bit hash part of the name. Thus, it looks like 
> `lambdaCapturingClass$$Lambda$hashValue`.
> Parameters used to create a stable hash are a superset of the parameters used 
> for lambda class archiving when the CDS dumping option is enabled. During 
> this process, all the mutual parameters are in the same form as they are in 
> the low-level implementation 
> (`SystemDictionaryShared::add_lambda_proxy_class`) of the archiving process.
> We decided to use a well-specified `CRC32` algorithm from the standard Java 
> library. We created two 32-bit hashes from the parameters used to create 
> stable names. Then, we combined those two 32-bit hashes into one 64-bit hash 
> value.
> We chose `CRC32` because it is a well-specified hash function, and we don't 
> need to write additional code in the JDK. `SHA-256`, `MD5`, and all other hash 
> functions that rely on `MessageDigest` use lambdas in the implementation, so 
> they are unsuitable for our purpose. We also considered a few different hash 
> functions with a low collision rate. All these functions would require at 
> least 100 lines of additional code in the JDK. The best alternative we found 
> is 64-bit `MurmurHash2`: 
> https://commons.apache.org/proper/commons-codec/jacoco/org.apache.commons.codec.digest/MurmurHash2.java.html.
> If adding a new hash implementation (e.g., Murmur2) to the JDK is 
> preferred, this PR can be easily modified.
> We found a post that compares different hash functions: 
> https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed/145633#145633
> We also tested the `CRC32` hash function against half a billion generated 
> strings, and there were no collisions. Note that the capturing-class name is 
> also part of the lambda class name, so the potential collisions can only 
> appear in a single class. Thus, we do not expect to have name collisions due 
> to a relatively low number of lambdas per class. Every tool that uses this 
> feature should handle potential collisions on its own.  
> We found an overall approximation of the collision rate too. You can find it 
> here: https://preshing.com/20110504/hash-collision-probabilities/.
> 
> The JDK currently adds an atomic integer after `$$Lambda$`, and the names of the 
> lambdas depend on the creation order. In the `Test...
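
For readers following the quoted description, here is a minimal sketch of how two `CRC32` values might be combined into one 64-bit hash and appended after `$$Lambda$`. It is not the code from the PR. The class name `StableLambdaNameSketch`, the way the parameters are split into two strings, and the hexadecimal formatting of the hash are all assumptions made for illustration. The `collisionProbability` helper applies the usual birthday approximation for a uniformly distributed 64-bit hash, which two concatenated CRC32 values only approximate.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class StableLambdaNameSketch {

    // CRC32 of a UTF-8 string; getValue() returns the 32-bit checksum
    // in the low 32 bits of a long.
    static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Combine two 32-bit hashes into one 64-bit value:
    // the first goes into the high half, the second into the low half.
    static long combine(long high32, long low32) {
        return (high32 << 32) | (low32 & 0xFFFFFFFFL);
    }

    // Hypothetical naming helper: how the real parameters are split into
    // two inputs, and how the hash is rendered, are assumptions.
    static String stableName(String capturingClass, String firstParams, String secondParams) {
        long hash = combine(crc32(firstParams), crc32(secondParams));
        return capturingClass + "$$Lambda$" + Long.toHexString(hash);
    }

    // Birthday approximation for k names drawn from N = 2^64 values:
    // p is roughly 1 - exp(-k(k-1) / (2N)), assuming a uniform hash.
    static double collisionProbability(long k) {
        double n = Math.pow(2, 64);
        double exponent = -((double) k * (k - 1)) / (2.0 * n);
        return -Math.expm1(exponent);
    }

    public static void main(String[] args) {
        // Illustrative inputs only; these are not the parameters used by the JDK.
        System.out.println(stableName("com.example.Foo", "run()V", "java/lang/Runnable"));
        System.out.println(collisionProbability(1000));
    }
}
```

Because the capturing-class name is part of the result, `k` in `collisionProbability` is the number of lambdas in a single class, not in the whole program.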

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/10024
