[ https://issues.apache.org/jira/browse/HBASE-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17880393#comment-17880393 ]
Ke Han commented on HBASE-28812:
--------------------------------

[Duo Zhang|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=zhangduo] Thank you for the PR! I have applied the patch to a030e80998, and it's working.

> Upgrade from 2.6.0 to 3.0.0 crashed
> -----------------------------------
>
>                 Key: HBASE-28812
>                 URL: https://issues.apache.org/jira/browse/HBASE-28812
>             Project: HBase
>          Issue Type: Bug
>          Components: compatibility
>    Affects Versions: 3.0.0
>            Reporter: Ke Han
>            Assignee: Duo Zhang
>            Priority: Major
>              Labels: pull-request-available, upgrade
>         Attachments: hbase--master-2d6e4fad2af5.log, hbase--master-440ed844e077.log
>
>
> I am trying to upgrade from 2.6.0 (stable release) to 3.0.0. I built 3.0.0 using the following commit (a030e8099840e640684a68b6e4a79e7c1d5a6823)
> {code:java}
> commit a030e8099840e640684a68b6e4a79e7c1d5a6823 (HEAD -> branch-3, upstream/branch-3)
> Author: Ray Mattingly <rmdmattin...@gmail.com>
> Date:   Mon Sep 2 04:38:29 2024 -0400
>
>     HBASE-28697 Don't clean bulk load system entries until backup is complete (#6089)
>
>     Co-authored-by: Ray Mattingly <rmattin...@hubspot.com>
> {code}
> However, the HMaster would crash during the upgrade process.
> h1. Reproduce
> Step1: Start up 2.6.0 cluster (1 HDFS, 1 HM, 1 RS)
> Step2: Stop the entire cluster
> Step3: Upgrade to 3.0.0 cluster.
> HMaster will crash with the following error message
> {code:java}
> 2024-09-04T04:29:18,917 WARN [master/hmaster:16000:becomeActiveMaster] regionserver.HRegion: Failed initialize of region= master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back memstore
> java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1215) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1158) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1030) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:974) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7794) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7749) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:277) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:432) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:135) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1003) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at java.lang.Thread.run(Thread.java:833) ~[?:?]
> Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:289) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:339) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:301) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6924) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1181) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1178) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
>     ... 1 more
> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://master:8020/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/info/82c6d244b6244c179cdbafcead00ed75
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:359) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:132) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:763) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:395) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:226) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:267) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
>     ... 1 more
> Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.KeyValue$KVComparator
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.getComparatorClass(FixedFileTrailer.java:578) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserializeFromPB(FixedFileTrailer.java:304) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserialize(FixedFileTrailer.java:250) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:407) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:349) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:132) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:763) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:395) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:226) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:267) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
>     ... 1 more
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.KeyValue$KVComparator
>     at jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) ~[?:?]
>     at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) ~[?:?]
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:520) ~[?:?]
>     at java.lang.Class.forName0(Native Method) ~[?:?]
>     at java.lang.Class.forName(Class.java:375) ~[?:?]
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.getComparatorClass(FixedFileTrailer.java:576) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserializeFromPB(FixedFileTrailer.java:304) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.deserialize(FixedFileTrailer.java:250) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:407) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:349) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:132) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:763) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:395) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:524) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:226) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:267) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
>     ... 1 more
> {code}
> This problem seems to be introduced recently, and I can still upgrade from 2.6.0 to 3.0.0 using the previous commits (E.g. commit from May 24: 516c89e8597fb6ed391f9e85e594f8b7e5b56e38).
> I have attached the hmaster log.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
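
The innermost cause in the trace above is a `Class.forName` lookup, made from `FixedFileTrailer.getComparatorClass`, on the comparator class name `org.apache.hadoop.hbase.KeyValue$KVComparator` stored in the HFile trailer. The following is only a minimal sketch of one way a reader could tolerate such a stored name, by translating it to a current class name before any class loading. It is not the actual patch for this issue. The class `LegacyComparatorNames`, the method `resolve`, and the choice of `org.apache.hadoop.hbase.CellComparatorImpl` as the replacement are assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical helper, not part of HBase: translates comparator class names
// found in old HFile trailers into names assumed to exist in the new version.
class LegacyComparatorNames {

    // Assumption for illustration: the legacy KVComparator name from the stack
    // trace is treated as equivalent to CellComparatorImpl.
    private static final Map<String, String> LEGACY_TO_CURRENT = Map.of(
        "org.apache.hadoop.hbase.KeyValue$KVComparator",
        "org.apache.hadoop.hbase.CellComparatorImpl");

    // Returns the mapped name for a known legacy name; any other name is
    // returned unchanged so that current trailers are unaffected.
    static String resolve(String nameFromTrailer) {
        return LEGACY_TO_CURRENT.getOrDefault(nameFromTrailer, nameFromTrailer);
    }

    public static void main(String[] args) {
        String stored = "org.apache.hadoop.hbase.KeyValue$KVComparator";
        System.out.println(stored + " -> " + resolve(stored));
        try {
            // Loading the stored name directly throws ClassNotFoundException
            // whenever that class is absent from the classpath.
            Class.forName(stored);
            System.out.println("legacy class is still on the classpath");
        } catch (ClassNotFoundException e) {
            System.out.println("direct lookup failed: " + e.getMessage());
        }
    }
}
```

Translating the name before calling `Class.forName` keeps the on-disk format untouched, so files written by the older release would not need to be rewritten for the newer release to open them.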