codelipenghui opened a new pull request, #25362: URL: https://github.com/apache/pulsar/pull/25362
## Motivation

The broker-side fallback logic for `SchemaType.JSON` schema validation is too lenient: it accepts **any valid JSON** as a schema definition, not just the legacy Jackson format from the Pulsar 2.0 era.

This has caused real issues for non-Java clients (e.g., Rust) where users accidentally register JSON Schema Draft 2020-12 definitions:

1. `StructSchemaDataValidator` accepts it (Avro parse fails → Jackson fallback succeeds)
2. `JsonSchemaCompatibilityCheck` allows it (permissive mixed-format handling)
3. But Java consumers fail with `SchemaParseException: Type not supported: object` because `AvroBaseStructSchema` requires Avro format with no fallback

The result is an asymmetry: the broker accepts any JSON, while the consumer requires Avro. Schemas get stored that no Java consumer can read.

## Changes

### New broker configuration

- `schemaJsonAllowLegacyJacksonFormat` (boolean, default `false`)

### Modified components (6 source files)

- **`ServiceConfiguration`**: new config field
- **`StructSchemaDataValidator`**: gates Jackson JsonSchema fallback on config flag; when `false`, Avro `SchemaParseException` propagates directly
- **`SchemaDataValidator`**: new `validateSchemaData(data, allowLegacy)` overload
- **`SchemaRegistryServiceWithSchemaDataValidator`**: carries and passes config flag
- **`JsonSchemaCompatibilityCheck`**: gates mixed-format compatibility on config flag; defense-in-depth rejection when existing schema is not valid Avro
- **`SchemaRegistryService`**: wires config from `PulsarService` to validator and compatibility checker

### Client-side (1 file)

- **`ProducerImpl`**: deprecation comment on backward-compat code path (no behavioral change)

### Tests (3 test files, +171 lines)

- `SchemaDataValidatorTest`: 8 new tests: Avro accepted in both modes, Jackson rejected by default / accepted when enabled, JSON Schema Draft rejected / accepted, arbitrary JSON always rejected, AVRO type unaffected
- `JsonSchemaCompatibilityCheckTest`: 4 new tests: legacy enabled allows mixed formats, default rejects mixed, Avro↔Avro unaffected, JSON Schema Draft rejected
- `SchemaRegistryServiceWithSchemaDataValidatorTest`: 3 new tests: Jackson rejected by default, accepted when enabled, JSON Schema Draft rejected

## Compatibility

This is a **breaking change** in default behavior. Users with legacy pre-2.1 Jackson-format schemas can restore the old behavior by setting `schemaJsonAllowLegacyJacksonFormat=true` in `broker.conf`. Java producers are unaffected (`JSONSchema.of()` generates Avro format since 2.1). Non-Java clients that were incorrectly registering JSON Schema Draft definitions will get a clear error at registration time instead of a confusing consumer-side failure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
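
## Illustrative sketch

For reviewers who want the gating behavior in one place, here is a minimal, self-contained sketch of the flag-gated fallback described under Changes. It is not the code in this PR. `LegacyFallbackGateSketch`, `SchemaParser`, `InvalidSchemaException`, `validate`, and `accepts` are hypothetical names, and the two parsers are toy string checks standing in for the real Avro and Jackson parsing. Only the control flow is meant to match the description: try Avro first, and consult the legacy path only when the flag is enabled.

```java
// Sketch only: the parsers below are toy stand-ins, not Avro or Jackson.
class LegacyFallbackGateSketch {

    /** Hypothetical stand-in for a schema parser; throws if the definition is not in its format. */
    interface SchemaParser {
        void parse(String definition) throws Exception;
    }

    /** Hypothetical exception type for a rejected schema definition. */
    static class InvalidSchemaException extends Exception {
        InvalidSchemaException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Toy "Avro" check (assumption): accepts only definitions declaring a record type.
    static final SchemaParser AVRO = def -> {
        if (!def.contains("\"type\":\"record\"")) {
            throw new IllegalArgumentException("Type not supported");
        }
    };

    // Toy "legacy" check (assumption): accepts only definitions declaring an object type.
    static final SchemaParser LEGACY = def -> {
        if (!def.contains("\"type\":\"object\"")) {
            throw new IllegalArgumentException("not a legacy definition");
        }
    };

    /** Tries the Avro stand-in first; falls back to the legacy stand-in only if allowLegacy is true. */
    static void validate(String definition, boolean allowLegacy) throws InvalidSchemaException {
        try {
            AVRO.parse(definition);
        } catch (Exception avroFailure) {
            if (!allowLegacy) {
                // Flag disabled: surface the Avro failure instead of falling back.
                throw new InvalidSchemaException("not a valid Avro schema", avroFailure);
            }
            try {
                LEGACY.parse(definition);
            } catch (Exception legacyFailure) {
                throw new InvalidSchemaException("neither Avro nor legacy format", legacyFailure);
            }
        }
    }

    /** Convenience wrapper that reports acceptance as a boolean. */
    static boolean accepts(String definition, boolean allowLegacy) {
        try {
            validate(definition, allowLegacy);
            return true;
        } catch (InvalidSchemaException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String avro = "{\"type\":\"record\",\"name\":\"User\",\"fields\":[]}";
        String legacy = "{\"type\":\"object\",\"properties\":{}}";
        System.out.println(accepts(avro, false));   // Avro-style definition, flag off
        System.out.println(accepts(legacy, false)); // non-Avro definition, flag off
        System.out.println(accepts(legacy, true));  // non-Avro definition, flag on
    }
}
```

In this sketch, a definition that passes the Avro stand-in is accepted regardless of the flag. A definition that fails it is rejected immediately when the flag is off, and the legacy stand-in decides when the flag is on.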
