rstest created HBASE-30219:
------------------------------

             Summary: Admin DDL silently returns success (no 
CreateTableProcedure, no table) when hbase:meta RegionServer is unreachable
                 Key: HBASE-30219
                 URL: https://issues.apache.org/jira/browse/HBASE-30219
             Project: HBase
          Issue Type: Bug
          Components: Admin, Client, proc-v2
    Affects Versions: 2.5.13, 2.6.4
            Reporter: rstest


h1. Summary

clone_table_schema / createTable / truncateTable return success (exitValue=0) 
when the issuing client's node is partitioned from the RegionServer hosting 
hbase:meta, but no CreateTableProcedure is ever registered on the master and 
the table is never created.

*Environment*: Three-node cluster (1 master + 2 RegionServers), single HBase 
version, ZooKeeper quorum on all three nodes. A network partition is active 
between the two RegionServers (so a client on the RegionServer that is cut off 
from the meta-hosting RegionServer cannot reach hbase:meta), while the master 
and the ZK quorum remain healthy. Found by differential fault-injection testing 
(the same command sequence run on independent clusters under the same 
partition, with results compared) and reproduced locally with full HMaster logs 
captured.

h2. Description

When a client issues a DDL operation (e.g. {{clone_table_schema}}, {{create}}, 
{{truncate_preserve}}) from a node that cannot reach the RegionServer currently 
hosting {{hbase:meta}}, the operation:

* blocks for ~60 seconds, then
* returns *success* to the caller ({{exitValue = 0}} from the shell; no 
exception from {{HBaseAdmin}}),

even though:

* the HMaster log contains *no CreateTableProcedure* (or 
TruncateTableProcedure) for that table at all - the operation never reached the 
master as a procedure, and
* the table is *never created*: {{exists 'T'}} returns false and {{scan 
'hbase:meta'}} has zero rows for it.

So a DDL call that reports success silently does nothing. There is no error 
surfaced to the client, no entry in {{hbase:meta}}, and no procedure on the 
master - the failure is completely invisible to the application, which believes 
the table now exists.

The decisive condition is *reachability of the meta-hosting RegionServer from 
the client's node*, not the health of the master or ZooKeeper. In our 
reproduction the master and the ZK quorum were both reachable from the client 
throughout; only the path from the client's RegionServer to the RegionServer 
hosting {{hbase:meta}} was cut. A DDL issued from the other side (the node that 
CAN reach the meta host) completes normally in ~1 second with a clean 
{{CreateTableProcedure ... ADD_TO_META -> ASSIGN -> state=ENABLED -> SUCCESS}}.

h2. Evidence (reproduced locally, HMaster logs captured)

In a run where {{hbase:meta}} was hosted on RegionServer R1 and the partition 
cut R1 from R2:

* A {{clone_table_schema}} issued from a client on *R2* (cannot reach R1/meta): 
client returns success after ~60 s; the complete HMaster log over the whole 
window contains *zero* mentions of the target table; {{exists}} = false; 0 rows 
in {{hbase:meta}}. -> silent loss.
* A {{clone_table_schema}} issued from a client on *R1* (meta is local): 
HMaster log shows {{Client=... create 'T'}} -> {{CreateTableProcedure pid=NNN: 
CREATE_TABLE_PRE_OPERATION -> WRITE_FS_LAYOUT -> ADD_TO_META -> ASSIGN -> 
state=ENABLED -> SUCCESS}} in ~1 s; table exists. -> normal.

The two cases differ only in which side of the partition the issuing client 
sits on relative to the meta host. Whichever DDL is issued from the 
meta-unreachable side is silently lost with a success return code; whichever is 
issued from the meta-reachable side succeeds.

h2. What is established vs. the one open detail

Established (reproduced, with logs):
* A DDL call returns success while no procedure is ever registered on the 
master and the table is never created.
* The trigger is the issuing client's node being unable to reach the 
RegionServer hosting {{hbase:meta}} (a RegionServer-to-RegionServer partition), 
with master and ZK quorum healthy.
* The ~60 s duration is consistent with the client retrying an operation that 
needs access to the unreachable meta host and ultimately returning success 
instead of surfacing the failure.

Open detail (the only thing not yet pinned down): the exact point in the 
client/Admin path that converts the meta-unreachable failure into a success 
return - e.g. the {{clone_table_schema}} source-descriptor read or the 
{{createTable}} pre-flight giving up after retries and returning normally 
rather than throwing. The observable contract violation (success returned, 
nothing created) does not depend on which it is.

h2. Steps to Reproduce (single version, no special build)

# Start a 3-node cluster (node0 = master, node1 + node2 = RegionServers), any 
single version (reproduced on 2.5.13 and 2.6.4).
# Ensure {{hbase:meta}} is hosted on node1 (e.g. {{move}} it there if needed; 
confirm with {{scan 'hbase:meta'}} / the master UI).
# Create a source table to clone: {{create 'S', {NAME => 'cf'}}}.
# Partition node1 <-> node2 only (e.g. {{iptables -A INPUT -s <peer> -j DROP}} 
both directions). Do NOT touch node0, so the master and the ZK quorum stay 
healthy.
# From a client/shell running on *node2* (the side cut off from meta on node1): 
{{clone_table_schema 'S', 'T'}} (or {{create 'T', {NAME => 'cf'}}}).

Observe:
* The command blocks ~60 s, then returns success / {{exitValue = 0}} (no 
exception).
* The HMaster log (node0) contains *no* CreateTableProcedure for {{T}}.
* {{exists 'T'}} => false; {{scan 'hbase:meta'}} has no rows for {{T}}.
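
The HMaster log check in the second bullet can be automated. The sketch below keeps only log lines that mention a create or truncate procedure together with the table name. It assumes nothing about the HMaster log format beyond the procedure class name and the table name appearing on the same line; {{MasterLogScan}} and {{procedureMentions}} are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class MasterLogScan {
    private MasterLogScan() {}

    /**
     * Returns the log lines that mention a create/truncate procedure for the
     * given table. The table name must appear as its own token, so a search
     * for "T" does not match "T2" or "ns:T".
     */
    static List<String> procedureMentions(List<String> logLines, String table) {
        // Approximate token boundary: not glued to word characters, '-', ':'
        // or a dotted name on either side.
        Pattern tableToken = Pattern.compile(
            "(?<![\\w.:-])" + Pattern.quote(table) + "(?![\\w-]|\\.\\w)");
        return logLines.stream()
            .filter(line -> line.contains("CreateTableProcedure")
                         || line.contains("TruncateTableProcedure"))
            .filter(line -> tableToken.matcher(line).find())
            .collect(Collectors.toList());
    }
}
```

An empty result for {{T}} over the whole test window, combined with a success return on the client, matches the failure described here. The lines can be loaded with {{java.nio.file.Files.readAllLines}}.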

Control: issuing the same {{clone_table_schema 'S', 'T2'}} from a client on 
*node1* (meta is local) creates {{T2}} normally in ~1 s. Healing the partition 
does not retroactively create {{T}} - the loss is durable.

Expected: a DDL operation must not return success unless the table was actually 
created and is durably present in {{hbase:meta}}. If it cannot reach the 
metadata it needs, it must fail visibly (timeout/exception), not return success.
Actual: success is returned with no procedure registered and no table created.

h2. Root cause pointers

* Shell DDL is a thin wrapper over {{Admin}}: {{clone_table_schema}} -> 
{{HBaseAdmin.cloneTableSchema}} (reads the source descriptor, then calls 
{{createTable}}); {{create}} -> {{HBaseAdmin.createTableAsync}} -> 
{{MasterRpcServices.createTable}} -> {{HMaster.createTable}} -> 
{{CreateTableProcedure}}; {{truncate_preserve}} -> 
{{HBaseAdmin.truncateTable(name, true)}}.
* The {{createTable}}/{{truncateTable}} futures are specified to wait for the 
table to be enabled and all regions online before returning. In the failing 
case the future returns success though no procedure ran - so the success path 
is reachable without the procedure ever being submitted/acknowledged.
* Suspect area: the client/Admin retry path for the meta-dependent step 
(source-descriptor read or createTable submission) when the meta-hosting 
RegionServer is unreachable - it appears to exhaust retries (~60 s) and return 
normally instead of throwing. A maintainer with the captured HMaster + client 
logs (available on request) can confirm the exact branch.

h2. Suggested fixes

# A DDL Admin call must surface failure (timeout/exception) when the underlying 
meta-dependent step cannot complete, rather than returning success. Returning 
success implies the table is durably created and visible in {{hbase:meta}}.
# Verify the post-condition before returning success: confirm the table exists 
in {{hbase:meta}} (and reached the expected enabled/region state), not merely 
that the local call sequence returned.
# Add a fault-injection regression test: 3-node cluster, partition a 
RegionServer from the meta-hosting RegionServer, issue create/clone/truncate 
from the cut-off side, and assert the call FAILS (or the table is durably 
created) - it must never return success with no table created.
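
The post-condition check in fix 2 could look roughly like the following. This is a minimal sketch, not HBase code: {{TableProbe}} and {{DdlPostCondition}} are hypothetical names introduced for illustration. In a real client the probe would presumably be backed by {{Admin.tableExists}} and {{Admin.isTableEnabled}}; it is an interface here so the sketch stays self-contained. The only point it makes is that an absent or unreachable table ends in an exception, never in a normal return.

```java
import java.io.IOException;
import java.time.Duration;

/** Hypothetical probe; a real one might wrap Admin.tableExists and Admin.isTableEnabled. */
interface TableProbe {
    boolean isCreatedAndEnabled(String table) throws IOException;
}

final class DdlPostCondition {
    private DdlPostCondition() {}

    /**
     * Polls the probe until the table is visible and enabled, or the timeout
     * expires. Returns normally only when the post-condition holds.
     */
    static void verifyCreated(TableProbe probe, String table,
                              Duration timeout, Duration pollInterval) {
        long start = System.nanoTime();
        IOException lastError = null;
        while (true) {
            try {
                if (probe.isCreatedAndEnabled(table)) {
                    return; // post-condition confirmed
                }
                lastError = null; // probe answered: table is not there (yet)
            } catch (IOException e) {
                lastError = e; // e.g. meta unreachable; kept for the final error
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                break;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(
                    "Interrupted while verifying table " + table, e);
            }
        }
        // Exhausting the wait is a failure, not a success.
        throw new IllegalStateException(
            "Table " + table + " not confirmed created/enabled within " + timeout,
            lastError);
    }
}
```

The same helper can serve as the assertion in the regression test of fix 3: after a DDL call that returned success from the cut-off side, {{verifyCreated}} must not throw.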

h2. Additional context

* The bug requires no version change and no special configuration - it 
reproduces on a single-version cluster (Steps to Reproduce above). It was 
originally surfaced by a differential test harness that compares independent 
clusters running the same plan under the same partition. Because the partition 
silently dropped DDL from one side, the clusters ended with divergent table 
sets (one cluster missing table A, another missing table B, a truncated table's 
enabled state differing). All of these are explained by the single rule above: a 
DDL from the meta-unreachable side is silently lost with a success return code.
* We can provide the captured per-lane HMaster logs (meta-unreachable side: 
zero mentions of the table; meta-reachable side: clean CreateTableProcedure 
SUCCESS), the RegionServer log showing the meta location, and the client-side 
timing showing the ~60 s success return.
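
The divergence between clusters described in the first bullet amounts to a set difference over the per-cluster table lists. A minimal sketch of that comparison follows. The harness's own implementation is not shown in this report, so {{TableSetDiff}} and the lane labels are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class TableSetDiff {
    private TableSetDiff() {}

    /**
     * For each lane (cluster), returns the tables that exist on some other
     * lane but are absent on this one. Lanes missing nothing are omitted.
     */
    static Map<String, Set<String>> missingPerLane(Map<String, Set<String>> tablesByLane) {
        // Every table seen on any lane.
        Set<String> union = new TreeSet<>();
        for (Set<String> tables : tablesByLane.values()) {
            union.addAll(tables);
        }
        Map<String, Set<String>> missing = new TreeMap<>();
        for (Map.Entry<String, Set<String>> lane : tablesByLane.entrySet()) {
            Set<String> absent = new TreeSet<>(union);
            absent.removeAll(lane.getValue());
            if (!absent.isEmpty()) {
                missing.put(lane.getKey(), absent);
            }
        }
        return missing;
    }
}
```

An empty result means every lane ended with the same table set. A non-empty result is a cue to check which side of the partition issued the DDL for each missing table.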




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
