[ 
https://issues.apache.org/jira/browse/HBASE-28894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil updated HBASE-28894:
-----------------------------------------
    Description: 
I'm seeing some failures on TestPrefetch.testPrefetchWithDelay in some 
pre-commit runs, I believe this is due to a race condition in PrefetchExecutor.
loadConfiguration.
In these failures, it seems we are getting the NPE below:
{noformat}
Stacktracejava.lang.NullPointerException
        at 
java.util.concurrent.ConcurrentSkipListMap.put(ConcurrentSkipListMap.java:1580)
        at 
org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.request(PrefetchExecutor.java:108)
        at 
org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.lambda$loadConfiguration$0(PrefetchExecutor.java:206)
        at 
java.util.concurrent.ConcurrentSkipListMap.forEach(ConcurrentSkipListMap.java:3269)
        at 
org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.loadConfiguration(PrefetchExecutor.java:200)
        at 
org.apache.hadoop.hbase.regionserver.PrefetchExecutorNotifier.onConfigurationChange(PrefetchExecutorNotifier.java:51)
        at 
org.apache.hadoop.hbase.io.hfile.TestPrefetch.testPrefetchWithDelay(TestPrefetch.java:378)
 {noformat}
I think this is because we are completing prefetch in this test before the 
induced delay, then this test triggers a new configuration change, but the 
prefetch thread calls PrefetchExecutor.complete just before the test thread 
reaches [this 
point|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java#L206]:
{noformat}
2024-10-01T11:28:10,660 DEBUG [Time-limited test {}] 
hfile.PrefetchExecutor(102): Prefetch requested for 
/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0,
 delay=25000 ms
2024-10-01T11:28:30,668 INFO  [Time-limited test {}] hbase.Waiter(181): Waiting 
up to [10,000] milli-secs(wait.for.ratio=[1])
2024-10-01T11:28:35,661 DEBUG [hfile-prefetch-1727782088576 {}] 
hfile.HFilePreadReader$1(103): No entry in the backing map for cache key 
71eefdb271ae4f65b694a6ec3d4287a0_0. 
...
2024-10-01T11:28:35,673 DEBUG [hfile-prefetch-1727782088576 {}] 
hfile.HFilePreadReader$1(103): No entry in the backing map for cache key 
71eefdb271ae4f65b694a6ec3d4287a0_52849. 
2024-10-01T11:28:35,674 DEBUG [Time-limited test {}] 
hfile.PrefetchExecutor(142): Prefetch cancelled for 
/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0
2024-10-01T11:28:35,674 DEBUG [hfile-prefetch-1727782088576 {}] 
hfile.PrefetchExecutor(121): Prefetch completed for 
71eefdb271ae4f65b694a6ec3d4287a0
2024-10-01T11:28:35,674 DEBUG [Time-limited test {}] 
hfile.PrefetchExecutor(102): Prefetch requested for 
/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0,
 delay=991 ms
...
{noformat}

  was:
I'm seeing some failures on TestPrefetch.testPrefetchWithDelay in some 
pre-commit runs, I believe this is a flakey. 

In these failures, it seems we are getting the NPE below:
{noformat}
Stacktracejava.lang.NullPointerException
        at 
java.util.concurrent.ConcurrentSkipListMap.put(ConcurrentSkipListMap.java:1580)
        at 
org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.request(PrefetchExecutor.java:108)
        at 
org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.lambda$loadConfiguration$0(PrefetchExecutor.java:206)
        at 
java.util.concurrent.ConcurrentSkipListMap.forEach(ConcurrentSkipListMap.java:3269)
        at 
org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.loadConfiguration(PrefetchExecutor.java:200)
        at 
org.apache.hadoop.hbase.regionserver.PrefetchExecutorNotifier.onConfigurationChange(PrefetchExecutorNotifier.java:51)
        at 
org.apache.hadoop.hbase.io.hfile.TestPrefetch.testPrefetchWithDelay(TestPrefetch.java:378)
 {noformat}

I think this is because we are completing prefetch in this test before the 
induced delay, then this test triggers a new configuration change, but the 
prefetch thread calls PrefetchExecutor.complete just before the test thread 
reaches [this 
point|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java#L206]:

{noformat}
2024-10-01T11:28:10,660 DEBUG [Time-limited test {}] 
hfile.PrefetchExecutor(102): Prefetch requested for 
/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0,
 delay=25000 ms
2024-10-01T11:28:30,668 INFO  [Time-limited test {}] hbase.Waiter(181): Waiting 
up to [10,000] milli-secs(wait.for.ratio=[1])
2024-10-01T11:28:35,661 DEBUG [hfile-prefetch-1727782088576 {}] 
hfile.HFilePreadReader$1(103): No entry in the backing map for cache key 
71eefdb271ae4f65b694a6ec3d4287a0_0. 
...
2024-10-01T11:28:35,673 DEBUG [hfile-prefetch-1727782088576 {}] 
hfile.HFilePreadReader$1(103): No entry in the backing map for cache key 
71eefdb271ae4f65b694a6ec3d4287a0_52849. 
2024-10-01T11:28:35,674 DEBUG [Time-limited test {}] 
hfile.PrefetchExecutor(142): Prefetch cancelled for 
/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0
2024-10-01T11:28:35,674 DEBUG [hfile-prefetch-1727782088576 {}] 
hfile.PrefetchExecutor(121): Prefetch completed for 
71eefdb271ae4f65b694a6ec3d4287a0
2024-10-01T11:28:35,674 DEBUG [Time-limited test {}] 
hfile.PrefetchExecutor(102): Prefetch requested for 
/home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0,
 delay=991 ms
...
{noformat}


> NPE on TestPrefetch.testPrefetchWithDelay
> -----------------------------------------
>
>                 Key: HBASE-28894
>                 URL: https://issues.apache.org/jira/browse/HBASE-28894
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>
> I'm seeing some failures on TestPrefetch.testPrefetchWithDelay in some 
> pre-commit runs, I believe this is due to a race condition in 
> PrefetchExecutor.
> loadConfiguration.
> In these failures, it seems we are getting the NPE below:
> {noformat}
> Stacktracejava.lang.NullPointerException
>       at 
> java.util.concurrent.ConcurrentSkipListMap.put(ConcurrentSkipListMap.java:1580)
>       at 
> org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.request(PrefetchExecutor.java:108)
>       at 
> org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.lambda$loadConfiguration$0(PrefetchExecutor.java:206)
>       at 
> java.util.concurrent.ConcurrentSkipListMap.forEach(ConcurrentSkipListMap.java:3269)
>       at 
> org.apache.hadoop.hbase.io.hfile.PrefetchExecutor.loadConfiguration(PrefetchExecutor.java:200)
>       at 
> org.apache.hadoop.hbase.regionserver.PrefetchExecutorNotifier.onConfigurationChange(PrefetchExecutorNotifier.java:51)
>       at 
> org.apache.hadoop.hbase.io.hfile.TestPrefetch.testPrefetchWithDelay(TestPrefetch.java:378)
>  {noformat}
> I think this is because we are completing prefetch in this test before the 
> induced delay, then this test triggers a new configuration change, but the 
> prefetch thread calls PrefetchExecutor.complete just before the test thread 
> reaches [this 
> point|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/PrefetchExecutor.java#L206]:
> {noformat}
> 2024-10-01T11:28:10,660 DEBUG [Time-limited test {}] 
> hfile.PrefetchExecutor(102): Prefetch requested for 
> /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0,
>  delay=25000 ms
> 2024-10-01T11:28:30,668 INFO  [Time-limited test {}] hbase.Waiter(181): 
> Waiting up to [10,000] milli-secs(wait.for.ratio=[1])
> 2024-10-01T11:28:35,661 DEBUG [hfile-prefetch-1727782088576 {}] 
> hfile.HFilePreadReader$1(103): No entry in the backing map for cache key 
> 71eefdb271ae4f65b694a6ec3d4287a0_0. 
> ...
> 2024-10-01T11:28:35,673 DEBUG [hfile-prefetch-1727782088576 {}] 
> hfile.HFilePreadReader$1(103): No entry in the backing map for cache key 
> 71eefdb271ae4f65b694a6ec3d4287a0_52849. 
> 2024-10-01T11:28:35,674 DEBUG [Time-limited test {}] 
> hfile.PrefetchExecutor(142): Prefetch cancelled for 
> /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0
> 2024-10-01T11:28:35,674 DEBUG [hfile-prefetch-1727782088576 {}] 
> hfile.PrefetchExecutor(121): Prefetch completed for 
> 71eefdb271ae4f65b694a6ec3d4287a0
> 2024-10-01T11:28:35,674 DEBUG [Time-limited test {}] 
> hfile.PrefetchExecutor(102): Prefetch requested for 
> /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-6328/yetus-jdk8-hadoop2-check/src/hbase-server/target/test-data/b646497b-7616-6533-e8cb-98e5c9e2e083/TestPrefetchWithDelay/71eefdb271ae4f65b694a6ec3d4287a0,
>  delay=991 ms
> ...
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to