sollhui opened a new pull request, #60953:
URL: https://github.com/apache/doris/pull/60953
## Summary
When using majority write (quorum success), BE does not distinguish between
replicas with continuous versions and replicas with version gaps
(`lastFailedVersion >= 0`). This makes BE's quorum check inconsistent with
FE's commit check, which correctly excludes version-gap replicas from the
success count.
## Bad Case
Consider 3 replicas on nodes 1, 2, 3 with `load_required_replica_num = 2`:

1. **First write**: nodes 1,2 succeed, node 3 fails → overall success.
   Node 3 now has a version gap (`lastFailedVersion >= 0`).
2. **Second write**: nodes 1,3 succeed, node 2 fails →
   - **BE** counts 2 successes (nodes 1,3), considers it quorum success.
   - **FE** commit only counts node 1 as success (node 3 has version gap),
     so `successReplicaNum = 1 < 2`, commit fails.
   - This wastes resources since BE already returned success to the client
     but FE rejects the transaction.

The correct behavior for the second write:

- nodes 1,3 succeed → should **FAIL** (node 3 has version gap, only node 1 counts)
- nodes 1,2 succeed → should **SUCCEED** (both have continuous versions)
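
The two outcomes above can be reproduced with a small Python model of the counting rule. This is only an illustrative sketch, not the actual `_quorum_success` code in BE: the function names `counted_successes` and `quorum_success` and the set-based representation of gap backends are hypothetical.

```python
from typing import Iterable, Set


def counted_successes(succeeded: Iterable[int], gap_backends: Set[int]) -> int:
    """Count successful replicas, skipping backends that have a version gap."""
    return sum(1 for backend_id in succeeded if backend_id not in gap_backends)


def quorum_success(succeeded: Iterable[int], gap_backends: Set[int], required: int) -> bool:
    """Return True when enough gap-free replicas succeeded."""
    return counted_successes(succeeded, gap_backends) >= required


# Second write from the bad case: node 3 carries a gap from the first write.
gap_backends = {3}
required = 2  # load_required_replica_num

# Behavior before the fix: BE has no gap information, so nodes 1,3 pass.
old_be_view = quorum_success([1, 3], set(), required)

# Behavior after the fix: node 3 is excluded, so only node 1 counts.
new_be_view = quorum_success([1, 3], gap_backends, required)

# Nodes 1,2 both have continuous versions, so the write still succeeds.
healthy_pair = quorum_success([1, 2], gap_backends, required)

print(old_be_view, new_be_view, healthy_pair)
```

With an empty gap set the model behaves like the old BE check, which is why the mismatch with FE only appears once a replica has fallen behind.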
## Solution
Pass per-tablet version-gap backend information from FE to BE via a new thrift
field `map<tablet_id, list<backend_id>> tablet_version_gap_backends` in
`TOlapTablePartition`.

On the BE side, when counting successful replicas for majority write in both
`VTabletWriter` (V1) and `VTabletWriterV2`, exclude version-gap backends from
the `finished_tablets_replica` counter. This makes BE's quorum check consistent
with FE's commit check.
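
As a rough illustration of the data that flows through the new field, the sketch below models in Python how FE might derive the per-tablet gap map and how BE might consult it. The `Replica` class, the `-1` default for `last_failed_version`, and both helper functions are assumptions made for this example; the real logic lives in `OlapTable.java` and the BE writers and may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Replica:
    """Hypothetical stand-in for FE replica metadata."""
    tablet_id: int
    backend_id: int
    last_failed_version: int = -1  # assumption: -1 means no version gap


def version_gap_backends(replicas: Iterable[Replica]) -> Dict[int, List[int]]:
    """FE side: build tablet_id -> [backend_id] for replicas with a gap."""
    gaps: Dict[int, List[int]] = defaultdict(list)
    for replica in replicas:
        if replica.last_failed_version >= 0:
            gaps[replica.tablet_id].append(replica.backend_id)
    return dict(gaps)


def tablet_quorum_success(tablet_id: int,
                          succeeded_backends: Iterable[int],
                          gap_map: Dict[int, List[int]],
                          required: int) -> bool:
    """BE side: ignore gap backends of this tablet when counting successes."""
    gaps = set(gap_map.get(tablet_id, []))
    counted = [b for b in succeeded_backends if b not in gaps]
    return len(counted) >= required


replicas = [
    Replica(100, 1), Replica(100, 2), Replica(100, 3, last_failed_version=5),
    Replica(200, 1), Replica(200, 2), Replica(200, 3),
]
gap_map = version_gap_backends(replicas)  # only tablet 100 has an entry

print(gap_map)
print(tablet_quorum_success(100, [1, 3], gap_map, required=2))
print(tablet_quorum_success(200, [1, 3], gap_map, required=2))
```

In this model a tablet with no gap replicas is simply absent from the map, so the lookup falls back to an empty list and the check reduces to a plain majority count.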
### Changes
- **Descriptors.thrift**: Add `tablet_version_gap_backends` field to `TOlapTablePartition`
- **OlapTable.java**: Add `getPartitionVersionGapBackends()` to compute gap backends per tablet
- **OlapTableSink.java**: Populate the new field when building partition info
- **tablet_info.h/cpp**: Parse and store gap backends from thrift
- **vtablet_writer.cpp**: Exclude gap backends in `_quorum_success`
- **vtablet_writer_v2.cpp**: Exclude gap backends in `_quorum_success` and `_create_commit_info`