geniusjoe opened a new pull request, #25514:
URL: https://github.com/apache/pulsar/pull/25514

   Fixes #18292
   Related #18701
   
   ### Motivation
   
   When multiple requests concurrently create a schema for a brand-new topic 
(i.e., the schema locator z-node does not yet exist), each request first 
creates a BookKeeper ledger via `addNewSchemaEntryToStore`, then attempts to 
create the schema locator z-node via CAS (`createSchemaLocator` with 
`expectedVersion = -1L`). 
   
   Only one request succeeds; the others fail with BadVersionException (because 
ZK MetadataStore translates NODEEXISTS to BadVersionException when 
expectedVersion == -1). However, the ledgers created by the failed requests 
were never cleaned up, resulting in orphan ledgers (dirty data) in BookKeeper.
   
   Note that the existing `updateSchemaLocator` method already has cleanup 
logic for this scenario (deleting the orphan ledger when CAS fails with 
`BadVersionException`) in #18701, but the `createNewSchema` method — which 
handles the **initial** schema creation path — was missing this cleanup.
   
   ### Modifications
   
   - **`BookkeeperSchemaStorage.createNewSchema`**: Added a `whenComplete` 
callback after `createSchemaLocator`. When the CAS operation fails due to 
`AlreadyExistsException` or `BadVersionException`, the orphan BookKeeper ledger 
is asynchronously deleted, consistent with the existing cleanup logic in 
`updateSchemaLocator`. The method is also refactored into three clearly 
commented steps for better readability.
   
   - **`PulsarMockLedgerHandle`**: Added a new constructor that accepts 
`Map<String, byte[]> customMetadata` and passes it to 
`LedgerMetadataBuilder.withCustomMetadata()`, so that mock ledgers can retain 
custom metadata set during creation.
   
   - **`PulsarMockBookKeeper`**: Updated `asyncCreateLedger` to forward the 
`properties` parameter to the new `PulsarMockLedgerHandle` constructor, 
enabling tests to inspect ledger custom metadata.
   
   - **`SchemaTest`**: Added `testConcurrentCreateSchemaNoOrphanLedger` test 
that verifies orphan ledgers are cleaned up when 16 producers concurrently 
create schema on a brand-new topic. The test inspects surviving BK ledgers via 
`PulsarMockBookKeeper.getLedgerMap()` and asserts that only 1 ledger with 
matching `pulsar/schemaId` custom metadata exists.
   
   ### Verifying this change
   
   This change added tests and can be verified as follows:
   
   - Added `testConcurrentCreateSchemaNoOrphanLedger` in `SchemaTest` that 
concurrently creates 16 producers with the same AVRO schema on a brand-new 
topic, then verifies:
     1. Only 1 schema version exists via `admin.schemas().getAllSchemas()`
     2. Only 1 surviving BK ledger has `customMetadata["pulsar/schemaId"]` 
matching the topic's schema name (orphan ledgers from failed concurrent 
creations were deleted)
   
   ### Does this pull request potentially affect one of the following parts:
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [x] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to