sollhui opened a new pull request, #60953:
URL: https://github.com/apache/doris/pull/60953

   ## Summary
   
   When using majority write (quorum success), BE does not distinguish between replicas
   with continuous versions and replicas with version gaps (`lastFailedVersion >= 0`).
   This causes inconsistency with FE's commit check, which correctly excludes
   version-gap replicas from success counting.
   
   ## Bad Case
   
   Consider 3 replicas on nodes 1, 2, 3 with `load_required_replica_num = 2`:
   
   1. **First write**: nodes 1,2 succeed, node 3 fails → overall success.
      Node 3 now has a version gap (`lastFailedVersion >= 0`).
   2. **Second write**: nodes 1,3 succeed, node 2 fails →
      - **BE** counts 2 successes (nodes 1,3), considers it quorum success.
      - **FE** commit only counts node 1 as success (node 3 has version gap),
        so `successReplicaNum = 1 < 2`, commit fails.
      - This wastes resources since BE already returned success to the client
        but FE rejects the transaction.
   
   The correct behavior for the second write:
   - nodes 1,3 succeed → should **FAIL** (node 3 has version gap, only node 1 counts)
   - nodes 1,2 succeed → should **SUCCEED** (both have continuous versions)
   
   ## Solution
   
   Pass per-tablet version-gap backend information from FE to BE via a new thrift field
   `map<tablet_id, list<backend_id>> tablet_version_gap_backends` in `TOlapTablePartition`.
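
To make the shape of the new field concrete, here is a minimal Python sketch of how a per-tablet gap map could be derived from replica state. The `Replica` record and the `version_gap_backends` helper are hypothetical names used only for illustration; the real logic lives in `OlapTable.getPartitionVersionGapBackends()` and may differ. The only rule taken from the description is that `lastFailedVersion >= 0` marks a version gap; treating negative values as "no gap" is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical stand-in for FE replica metadata.
    tablet_id: int
    backend_id: int
    last_failed_version: int  # assumed: a negative value means no version gap


def version_gap_backends(replicas):
    """Return {tablet_id: sorted backend_ids whose replica has a version gap}."""
    gaps = defaultdict(list)
    for r in replicas:
        # Per the PR description, a gap is signalled by lastFailedVersion >= 0.
        if r.last_failed_version >= 0:
            gaps[r.tablet_id].append(r.backend_id)
    # Tablets without any gap replica are simply absent from the result.
    return {tablet: sorted(backends) for tablet, backends in gaps.items()}
```

In this sketch, tablets whose replicas are all continuous do not appear in the map, which keeps the payload small in the common case.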
   
   On the BE side, when counting successful replicas for majority write in both
   `VTabletWriter` (V1) and `VTabletWriterV2`, exclude version-gap backends from
   the `finished_tablets_replica` counter. This makes BE's quorum check consistent
   with FE's commit check.
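
The gap-aware counting described above can be sketched as follows. This is an illustrative Python model, not the BE implementation (which is C++ in `vtablet_writer.cpp` and `vtablet_writer_v2.cpp`); the function name `quorum_success` and its parameters are assumptions chosen for the example, and it assumes `tablet_successes` has an entry for every tablet being written.

```python
def quorum_success(tablet_successes, gap_backends, required_replica_num):
    """Return True if every tablet has enough successful replicas that count.

    tablet_successes: {tablet_id: set of backend_ids that finished the write}
    gap_backends:     {tablet_id: iterable of backend_ids with a version gap}
    """
    for tablet_id, succeeded in tablet_successes.items():
        # Replicas on version-gap backends are excluded from the success count,
        # mirroring what the FE commit check does.
        excluded = set(gap_backends.get(tablet_id, ()))
        counted = len(set(succeeded) - excluded)
        if counted < required_replica_num:
            return False
    return True


# Second write from the bad case: node 3 has a version gap on tablet 100.
gaps = {100: [3]}
print(quorum_success({100: {1, 3}}, gaps, 2))  # False: only node 1 counts
print(quorum_success({100: {1, 2}}, gaps, 2))  # True: both are continuous
```

With an empty gap map the function reduces to the plain majority check, which is the behaviour before this change.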
   
   ### Changes
   
   - **Descriptors.thrift**: Add `tablet_version_gap_backends` field to `TOlapTablePartition`
   - **OlapTable.java**: Add `getPartitionVersionGapBackends()` to compute gap backends per tablet
   - **OlapTableSink.java**: Populate the new field when building partition info
   - **tablet_info.h/cpp**: Parse and store gap backends from thrift
   - **vtablet_writer.cpp**: Exclude gap backends in `_quorum_success`
   - **vtablet_writer_v2.cpp**: Exclude gap backends in `_quorum_success` and `_create_commit_info`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]