lhotari opened a new pull request, #25570:
URL: https://github.com/apache/pulsar/pull/25570

   ### Motivation
   
   `PackageMetadataUtil.fromBytes` used 
`org.apache.commons.lang3.SerializationUtils.deserialize`, a thin wrapper over 
`ObjectInputStream.readObject`. `readObject` instantiates arbitrary classes 
from the stream **before** the post-read `instanceof PackageMetadata` check 
runs, so any gadget chain on the classpath would fire before validation. This 
was the only production Java-deserialization surface in Pulsar 
(`SerializationUtils.deserialize` had exactly one hit; the remaining 
`ObjectInputStream`/`readObject` references are `BitSetRecyclable`'s in-memory 
recycler and the BouncyCastle PEM reader, neither a Java-serialization surface).
   
   #### Severity — not externally reachable
   
   The deserialization path is **not reachable from external attackers**. The 
REST entry points (`pulsar-broker/.../admin/v3/Packages.java`) accept JSON via 
Jackson from authenticated admins; Pulsar then re-serializes with Java 
serialization to store in BookKeeper / filesystem. The Java-serialized blob 
only exists at the storage layer, written by Pulsar itself from 
already-validated JSON. Exploiting the gadget chain requires either write 
access to the backing package storage, or a separate future bug that feeds raw 
bytes into `fromBytes`.
   
   This PR is best understood as **defense in depth plus eliminating an 
unnecessary dangerous primitive**, not a live RCE fix.
   
   ### Modifications
   
   **Write path** — JSON via the existing `ObjectMapperFactory` (same factory 
used across the codebase for admin/REST serialization).
   
   **Read path** — auto-detects format per entry:
   - Leading `{` → JSON (always safe).
   - Java stream magic `0xAC 0xED` → legacy deserialization, gated by config 
and wrapped in a strict JEP 290 `ObjectInputFilter`. The filter allowlist is 
narrow: `PackageMetadata`, 
`HashMap`/`LinkedHashMap`/`Map`/`Map.Entry`/`AbstractMap`, 
`String`/`Number`/`Long`/`Boolean`, `Object`, plus resource limits 
(`maxdepth=5`, `maxrefs=100`, `maxbytes=1MB`, `maxarray=1024`) and `!*` denying 
everything else. Class descriptors are filtered *before* instantiation, so 
gadget chains fail immediately.
   - Anything else → `MetadataFormatException` with a message naming the config 
flag to flip.
   
   **Two new broker config flags (both default `true`, under 
`CATEGORY_PACKAGES_MANAGEMENT`)**:
   - `packagesManagementJsonSerializationEnabled` — write format. Rollback 
switch; legacy format will be removed in a future release.
   - `packagesManagementAllowLegacyJavaSerialization` — read acceptance. The 
filter is always applied when the legacy path runs. This default is scheduled 
to flip to `false` in a future release.
   
   The two flags allow a phased rollout: defaults ship the fix active; 
operators can flip either independently (rollback writes, or tighten reads 
after migration).
   
   On broker startup the new ctor in `PackagesManagementImpl` emits `WARN`s 
describing the active combination (migration window, rollback mode, or the 
misconfigured "new writes use Java but reads reject it" state).
   
   ### Verifying this change
   
   This change added tests and can be verified as follows:
   
   - `./gradlew :pulsar-package-management:pulsar-package-core:test --tests 
'PackageMetadataSerdeTest' --tests 'PackagesManagementImplTest'` — all pass.
   - `./gradlew :pulsar-broker-common:compileJava :pulsar-broker:compileJava` — 
passes.
   - Static audit: `rg 'SerializationUtils\.(de)?serialize|ObjectInputStream' 
pulsar-package-management/ --glob '*.java'` → production hits are confined to 
`PackageMetadataUtil.java` (both paths gated by flags); remaining hits are the 
filter-rejection test fixture.
   
   New tests in `PackageMetadataSerdeTest`:
   - `testJsonRoundTrip` — confirms bytes start with `{`.
   - `testLegacyRoundTrip` — confirms bytes start with `0xAC 0xED`.
   - `testJsonReadableWhenLegacyDisabled` — JSON always works.
   - `testLegacyRejectedWhenLegacyDisabled` — rejection names the config flag.
   - `testFilterRejectsUnsafeClass` — crafts a valid Java stream with a 
disallowed class (`ArrayList`); filter rejects **before** instantiation, not 
via the post-read `instanceof`.
   - `testUnknownFormatRejected`, `testEmptyRejected`.
   
   ### Does this pull request potentially affect one of the following parts:
   
   - [x] The public API — `PackageMetadataUtil.fromBytes(byte[])` and 
`toBytes(PackageMetadata)` remain as `@Deprecated` overloads delegating to the 
new two-arg forms; source-compatible for external callers.
   - [x] The default values of configurations — two new configs added (both 
default `true`; one scheduled to flip to `false` in a future release as 
documented in its `@FieldContext` doc).
   
   ### Documentation
   
   - [x] `doc-not-needed`
   
   (The config docs in `ServiceConfiguration.java` carry the operator-facing 
migration notes; a separate release-notes entry will cover the scheduled 
default flip.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to