codelipenghui opened a new pull request, #25362:
URL: https://github.com/apache/pulsar/pull/25362

   ## Motivation
   
   The broker-side fallback logic for `SchemaType.JSON` schema validation is 
too lenient: it accepts **any valid JSON** as a schema definition, not just 
the legacy Jackson format from the Pulsar 2.0 era. Non-Java clients (e.g., Rust) 
have hit this when users accidentally register JSON Schema Draft 2020-12 
definitions:
   
   1. `StructSchemaDataValidator` accepts it (Avro parse fails → Jackson 
fallback succeeds)
   2. `JsonSchemaCompatibilityCheck` allows it (permissive mixed-format 
handling)
   3. But Java consumers fail with `SchemaParseException: Type not supported: 
object` because `AvroBaseStructSchema` requires Avro format with no fallback
   
   The result is an asymmetry: the broker accepts any JSON, while Java consumers 
require Avro, so schemas get stored that no Java consumer can read.
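
To illustrate step 3, here is a minimal sketch of why a JSON Schema Draft document cannot be read as Avro. It assumes the check reduces to looking up the root `type` value. `AvroTypeNames` and `isAvroTypeName` are hypothetical names used for illustration only, not Avro or Pulsar APIs. The listed names are the primitive and complex type names from the Avro specification.

```java
import java.util.Set;

// Hypothetical helper, for illustration only (not part of Avro or Pulsar).
final class AvroTypeNames {
    // Primitive and complex type names defined by the Avro specification.
    // Unions are written as JSON arrays, so they have no type name here.
    // A real parser also resolves references to previously defined named types.
    private static final Set<String> NAMES = Set.of(
            "null", "boolean", "int", "long", "float", "double", "bytes", "string",
            "record", "enum", "array", "map", "fixed");

    // Returns true if the given "type" value is a type name Avro defines.
    static boolean isAvroTypeName(String type) {
        return NAMES.contains(type);
    }
}
```

A JSON Schema Draft 2020-12 definition typically has `"type": "object"` at its root, which is not in this set, whereas an Avro schema for a struct uses `"type": "record"`. That mismatch is consistent with the `Type not supported: object` message quoted above.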
   
   ## Changes
   
   ### New broker configuration
   - `schemaJsonAllowLegacyJacksonFormat` (boolean, default `false`)
   
   ### Modified components (6 source files)
   - **`ServiceConfiguration`** — new config field
   - **`StructSchemaDataValidator`** — gates Jackson JsonSchema fallback on 
config flag; when `false`, Avro `SchemaParseException` propagates directly
   - **`SchemaDataValidator`** — new `validateSchemaData(data, allowLegacy)` 
overload
   - **`SchemaRegistryServiceWithSchemaDataValidator`** — carries and passes 
config flag
   - **`JsonSchemaCompatibilityCheck`** — gates mixed-format compatibility on 
config flag; defense-in-depth rejection when existing schema is not valid Avro
   - **`SchemaRegistryService`** — wires config from `PulsarService` to 
validator and compatibility checker
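
The gating described for `StructSchemaDataValidator` can be sketched roughly as follows. This is not the PR's code: `GatedSchemaValidator` is a hypothetical class, and the two parsers are passed in as plain `Consumer<String>` stand-ins (assumed to throw an unchecked exception on failure) so the sketch runs without Avro or Jackson on the classpath.

```java
import java.util.function.Consumer;

// Hypothetical sketch of a flag-gated fallback, not the actual Pulsar class.
final class GatedSchemaValidator {
    private final Consumer<String> avroParser;   // stand-in for the Avro schema parser
    private final Consumer<String> legacyParser; // stand-in for the Jackson JsonSchema fallback
    private final boolean allowLegacy;           // mirrors schemaJsonAllowLegacyJacksonFormat

    GatedSchemaValidator(Consumer<String> avroParser,
                         Consumer<String> legacyParser,
                         boolean allowLegacy) {
        this.avroParser = avroParser;
        this.legacyParser = legacyParser;
        this.allowLegacy = allowLegacy;
    }

    // Validates a JSON schema definition; throws whatever the parsers throw.
    void validate(String schemaDefinition) {
        try {
            avroParser.accept(schemaDefinition);
        } catch (RuntimeException avroFailure) {
            if (!allowLegacy) {
                // Flag off: the Avro failure propagates directly, no fallback.
                throw avroFailure;
            }
            // Flag on: try the legacy format; its failure (if any) propagates.
            legacyParser.accept(schemaDefinition);
        }
    }
}
```

With `allowLegacy` set to `false`, a definition that fails Avro parsing is rejected at registration time. With `true`, the legacy parser gets the final say, which restores the previous lenient behavior.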
   
   ### Client-side (1 file)
   - **`ProducerImpl`** — deprecation comment on backward-compat code path (no 
behavioral change)
   
   ### Tests (3 test files, +171 lines)
   - `SchemaDataValidatorTest` — 8 new tests: Avro accepted in both modes, 
Jackson rejected by default / accepted when enabled, JSON Schema Draft rejected 
by default / accepted when enabled, arbitrary JSON always rejected, AVRO type 
unaffected
   - `JsonSchemaCompatibilityCheckTest` — 4 new tests: legacy enabled allows 
mixed formats, default rejects mixed, Avro↔Avro unaffected, JSON Schema Draft 
rejected
   - `SchemaRegistryServiceWithSchemaDataValidatorTest` — 3 new tests: Jackson 
rejected by default, accepted when enabled, JSON Schema Draft rejected
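
The mixed-format cases covered by `JsonSchemaCompatibilityCheckTest` can be read as a small decision table. The sketch below is an assumption-laden simplification: `JsonFormatGate` and its `Format` enum are hypothetical, the real Avro compatibility rules are omitted, and treating a pair of non-Avro schemas as allowed when the flag is on is an assumption rather than something the PR text states.

```java
// Hypothetical decision table for the format gate only; the actual Avro
// compatibility check that follows for Avro-to-Avro pairs is not modeled.
final class JsonFormatGate {
    // Hypothetical classification of an existing or proposed schema definition.
    enum Format { AVRO, NON_AVRO }

    static boolean formatsAllowed(Format existing, Format proposed, boolean allowLegacy) {
        if (existing == Format.AVRO && proposed == Format.AVRO) {
            // Avro to Avro is unaffected by the flag.
            return true;
        }
        // Any pair involving a non-Avro schema passes only when the legacy
        // flag is enabled. With the flag off, this also rejects the case
        // where the existing schema is not valid Avro (defense in depth).
        return allowLegacy;
    }
}
```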
   
   ## Compatibility
   
   This is a **breaking change** in default behavior. Users with legacy pre-2.1 
Jackson-format schemas can restore the old behavior by setting 
`schemaJsonAllowLegacyJacksonFormat=true` in `broker.conf`.
   
   Java producers are unaffected (`JSONSchema.of()` generates Avro format since 
2.1). Non-Java clients that were incorrectly registering JSON Schema Draft 
definitions will get a clear error at registration time instead of a confusing 
consumer-side failure.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)

