dependabot[bot] opened a new pull request, #20650: URL: https://github.com/apache/camel/pull/20650
Bumps [org.jsoup:jsoup](https://github.com/jhy/jsoup) from 1.21.2 to 1.22.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/jhy/jsoup/releases">org.jsoup:jsoup's releases</a>.</em></p>
<blockquote>
<h2>jsoup Java HTML Parser release 1.22.1</h2>
<p><strong>jsoup 1.22.1</strong> is out now, adding support for the <code>re2j</code> regular expression engine for regex-based CSS selectors, a configurable maximum parser depth, and numerous bug fixes and improvements.</p>
<p><strong>jsoup</strong> is a Java library for working with real-world HTML and XML. It provides a very convenient API for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors.</p>
<p><a href="https://github.com/jhy/jsoup/blob/HEAD/download"><strong>Download</strong></a> jsoup now.</p>
<h3>Improvements</h3>
<ul>
<li>Added support for using the <code>re2j</code> regular expression engine for regex-based CSS selectors (e.g. <code>[attr~=regex]</code>, <code>:matches(regex)</code>), which ensures linear-time performance for regex evaluation. This allows safer handling of arbitrary user-supplied query regexes. To enable, add the <code>com.google.re2j</code> dependency to your classpath, e.g.:</li>
</ul>
<pre lang="xml"><code>&lt;dependency&gt;
  &lt;groupId&gt;com.google.re2j&lt;/groupId&gt;
  &lt;artifactId&gt;re2j&lt;/artifactId&gt;
  &lt;version&gt;1.8&lt;/version&gt;
&lt;/dependency&gt;
</code></pre>
<p>(If you already have that dependency in your classpath, but you want to keep using the Java regex engine, you can disable re2j via <code>System.setProperty("jsoup.useRe2j", "false")</code>.) You can confirm that the re2j engine has been enabled correctly by calling <code>Regex.usingRe2j()</code>. <a href="https://redirect.github.com/jhy/jsoup/pull/2407">#2407</a></p>
<ul>
<li>Added an instance method <code>Parser#unescape(String, boolean)</code> that unescapes HTML entities using the parser's configuration (e.g.
to support error tracking), complementing the existing static utility <code>Parser.unescapeEntities(String, boolean)</code>. <a href="https://redirect.github.com/jhy/jsoup/pull/2396">#2396</a></li>
<li>Added a configurable maximum parser depth (to limit the number of open elements on stack) to both HTML and XML parsers. The HTML parser now defaults to a depth of 512 to match browser behavior, and protect against unbounded stack growth, while the XML parser keeps unlimited depth by default, but can opt into a limit via <code>Parser.setMaxDepth()</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/2421">#2421</a></li>
<li>Build: added CI coverage for JDK 25 <a href="https://redirect.github.com/jhy/jsoup/pull/2403">#2403</a></li>
<li>Build: added a CI fuzzer for contextual fragment parsing (in addition to existing full body HTML and XML fuzzers). <a href="https://redirect.github.com/google/oss-fuzz/pull/14041">oss-fuzz #14041</a></li>
</ul>
<h3>Changes</h3>
<ul>
<li>Set a removal schedule of jsoup 1.24.1 for previously deprecated APIs.</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Previously cached child <code>Elements</code> of an <code>Element</code> were not correctly invalidated in <code>Node#replaceWith(Node)</code>, which could lead to incorrect results when subsequently calling <code>Element#children()</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/2391">#2391</a></li>
<li>Attribute selector values are now compared literally without trimming. Previously, jsoup trimmed whitespace from selector values and from element attribute values, which could cause mismatches with browser behavior (e.g. <code>[attr=" foo "]</code>).
Now matches align with the CSS specification and browser engines. <a href="https://redirect.github.com/jhy/jsoup/issues/2380">#2380</a></li>
<li>When using the JDK HttpClient, any system default proxy (<code>ProxySelector.getDefault()</code>) was ignored. Now, the system proxy is used if a per-request proxy is not set. <a href="https://redirect.github.com/jhy/jsoup/issues/2388">#2388</a>, <a href="https://redirect.github.com/jhy/jsoup/pull/2390">#2390</a></li>
<li>A <code>ValidationException</code> could be thrown in the adoption agency algorithm with particularly broken input. Now logged as a parse error. <a href="https://redirect.github.com/jhy/jsoup/issues/2393">#2393</a></li>
<li>Null characters in the HTML body were not consistently removed; and in foreign content were not correctly replaced. <a href="https://redirect.github.com/jhy/jsoup/issues/2395">#2395</a></li>
<li>An <code>IndexOutOfBoundsException</code> could be thrown when parsing a body fragment with crafted input. Now logged as a parse error. <a href="https://redirect.github.com/jhy/jsoup/issues/2397">#2397</a>, <a href="https://redirect.github.com/jhy/jsoup/issues/2406">#2406</a></li>
<li>When using StructuralEvaluators (e.g., a <code>parent child</code> selector) across many retained threads, their memoized results could also be retained, increasing memory use. These results are now cleared immediately after use, reducing overall memory consumption.
<a href="https://redirect.github.com/jhy/jsoup/issues/2411">#2411</a></li>
<li>Cloning a <code>Parser</code> now preserves any custom <code>TagSet</code> applied to the parser. <a href="https://redirect.github.com/jhy/jsoup/issues/2422">#2422</a>, <a href="https://redirect.github.com/jhy/jsoup/pull/2423">#2423</a></li>
<li>Custom tags marked as <code>Tag.Void</code> now parse and serialize like the built-in void elements: they no longer consume following content, and the XML serializer emits the expected self-closing form. <a href="https://redirect.github.com/jhy/jsoup/issues/2425">#2425</a></li>
<li>The <code>&lt;br&gt;</code> element is once again classified as an inline tag (<code>Tag.isBlock() == false</code>), matching common developer expectations and its role as phrasing content in HTML, while pretty-printing and text extraction continue to treat it as a line break in the rendered output. <a href="https://redirect.github.com/jhy/jsoup/issues/2387">#2387</a>, <a href="https://redirect.github.com/jhy/jsoup/issues/2439">#2439</a></li>
<li>Fixed an intermittent truncation issue when fetching and parsing remote documents via <code>Jsoup.connect(url).get()</code>. On responses without a charset header, the initial charset sniff could sometimes (depending on buffering / <code>available()</code> behavior) be mistaken for end-of-stream and a partial parse reused, dropping trailing content.
<a href="https://redirect.github.com/jhy/jsoup/issues/2448">#2448</a></li>
<li><code>TagSet</code> copies no longer mutate their template during lazy lookups, preventing cross-thread <code>ConcurrentModificationException</code> when parsing with shared sessions. <a href="https://redirect.github.com/jhy/jsoup/pull/2453">#2453</a></li>
<li>Fixed parsing of <code>&lt;svg&gt;</code> <code>foreignObject</code> content nested within a <code>&lt;p&gt;</code>, which could incorrectly move the HTML subtree outside the SVG. <a href="https://redirect.github.com/jhy/jsoup/issues/2452">#2452</a></li>
</ul>
<h3>Internal Changes</h3>
<ul>
<li>Deprecated internal helper <code>org.jsoup.internal.Functions</code> (for removal in v1.23.1). This was previously used to support older Android API levels without full <code>java.util.function</code> coverage; jsoup now requires core library desugaring so this indirection is no longer necessary. <a href="https://redirect.github.com/jhy/jsoup/pull/2412">#2412</a></li>
</ul>
<hr />
<p>My sincere thanks to everyone who contributed to this release!
If you have any suggestions for the next release, I would love to hear them; please get in touch via <a href="https://github.com/jhy/jsoup/discussions">jsoup discussions</a>, or with me <a href="https://jhedley.com/">directly</a>.</p>
<p>You can also follow me on Mastodon / Fediverse to receive occasional notes about jsoup releases.</p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/jhy/jsoup/blob/master/CHANGES.md">org.jsoup:jsoup's changelog</a>.</em></p>
<blockquote>
<h2>1.22.1 (2026-Jan-01)</h2>
<h3>Improvements</h3>
<ul>
<li>Added support for using the <code>re2j</code> regular expression engine for regex-based CSS selectors (e.g. <code>[attr~=regex]</code>, <code>:matches(regex)</code>), which ensures linear-time performance for regex evaluation. This allows safer handling of arbitrary user-supplied query regexes. To enable, add the <code>com.google.re2j</code> dependency to your classpath, e.g.:</li>
</ul>
<pre lang="xml"><code>&lt;dependency&gt;
  &lt;groupId&gt;com.google.re2j&lt;/groupId&gt;
  &lt;artifactId&gt;re2j&lt;/artifactId&gt;
  &lt;version&gt;1.8&lt;/version&gt;
&lt;/dependency&gt;
</code></pre>
<p>(If you already have that dependency in your classpath, but you want to keep using the Java regex engine, you can disable re2j via <code>System.setProperty("jsoup.useRe2j", "false")</code>.) You can confirm that the re2j engine has been enabled correctly by calling <code>org.jsoup.helper.Regex.usingRe2j()</code>. <a href="https://redirect.github.com/jhy/jsoup/pull/2407">#2407</a></p>
<ul>
<li>Added an instance method <code>Parser#unescape(String, boolean)</code> that unescapes HTML entities using the parser's configuration (e.g.
to support error tracking), complementing the existing static utility <code>Parser.unescapeEntities(String, boolean)</code>. <a href="https://redirect.github.com/jhy/jsoup/pull/2396">#2396</a></li>
<li>Added a configurable maximum parser depth (to limit the number of open elements on stack) to both HTML and XML parsers. The HTML parser now defaults to a depth of 512 to match browser behavior, and protect against unbounded stack growth, while the XML parser keeps unlimited depth by default, but can opt into a limit via <code>org.jsoup.parser.Parser#setMaxDepth</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/2421">#2421</a></li>
<li>Build: added CI coverage for JDK 25 <a href="https://redirect.github.com/jhy/jsoup/pull/2403">#2403</a></li>
<li>Build: added a CI fuzzer for contextual fragment parsing (in addition to existing full body HTML and XML fuzzers). <a href="https://redirect.github.com/google/oss-fuzz/pull/14041">oss-fuzz #14041</a></li>
</ul>
<h3>Changes</h3>
<ul>
<li>Set a removal schedule of jsoup 1.24.1 for previously deprecated APIs.</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Previously cached child <code>Elements</code> of an <code>Element</code> were not correctly invalidated in <code>Node#replaceWith(Node)</code>, which could lead to incorrect results when subsequently calling <code>Element#children()</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/2391">#2391</a></li>
<li>Attribute selector values are now compared literally without trimming. Previously, jsoup trimmed whitespace from selector values and from element attribute values, which could cause mismatches with browser behavior (e.g. <code>[attr=" foo "]</code>). Now matches align with the CSS specification and browser engines.
<a href="https://redirect.github.com/jhy/jsoup/issues/2380">#2380</a></li>
<li>When using the JDK HttpClient, any system default proxy (<code>ProxySelector.getDefault()</code>) was ignored. Now, the system proxy is used if a per-request proxy is not set. <a href="https://redirect.github.com/jhy/jsoup/issues/2388">#2388</a>, <a href="https://redirect.github.com/jhy/jsoup/pull/2390">#2390</a></li>
<li>A <code>ValidationException</code> could be thrown in the adoption agency algorithm with particularly broken input. Now logged as a parse error. <a href="https://redirect.github.com/jhy/jsoup/issues/2393">#2393</a></li>
<li>Null characters in the HTML body were not consistently removed; and in foreign content were not correctly replaced. <a href="https://redirect.github.com/jhy/jsoup/issues/2395">#2395</a></li>
<li>An <code>IndexOutOfBoundsException</code> could be thrown when parsing a body fragment with crafted input. Now logged as a parse error. <a href="https://redirect.github.com/jhy/jsoup/issues/2397">#2397</a>, <a href="https://redirect.github.com/jhy/jsoup/issues/2406">#2406</a></li>
<li>When using StructuralEvaluators (e.g., a <code>parent child</code> selector) across many retained threads, their memoized results could also be retained, increasing memory use. These results are now cleared immediately after use, reducing overall memory consumption. <a href="https://redirect.github.com/jhy/jsoup/issues/2411">#2411</a></li>
<li>Cloning a <code>Parser</code> now preserves any custom <code>TagSet</code> applied to the parser. <a href="https://redirect.github.com/jhy/jsoup/issues/2422">#2422</a>, <a href="https://redirect.github.com/jhy/jsoup/pull/2423">#2423</a></li>
<li>Custom tags marked as <code>Tag.Void</code> now parse and serialize like the built-in void elements: they no longer consume following content, and the XML serializer emits the expected self-closing form.
<a href="https://redirect.github.com/jhy/jsoup/issues/2425">#2425</a></li>
<li>The <code>&lt;br&gt;</code> element is once again classified as an inline tag (<code>Tag.isBlock() == false</code>), matching common developer expectations and its role as phrasing content in HTML, while pretty-printing and text extraction continue to treat it as a line break in the rendered output. <a href="https://redirect.github.com/jhy/jsoup/issues/2387">#2387</a>, <a href="https://redirect.github.com/jhy/jsoup/issues/2439">#2439</a></li>
<li>Fixed an intermittent truncation issue when fetching and parsing remote documents via <code>Jsoup.connect(url).get()</code>. On responses without a charset header, the initial charset sniff could sometimes (depending on buffering / <code>available()</code> behavior) be mistaken for end-of-stream and a partial parse reused, dropping trailing content. <a href="https://redirect.github.com/jhy/jsoup/issues/2448">#2448</a></li>
<li><code>TagSet</code> copies no longer mutate their template during lazy lookups, preventing cross-thread <code>ConcurrentModificationException</code> when parsing with shared sessions. <a href="https://redirect.github.com/jhy/jsoup/pull/2453">#2453</a></li>
<li>Fixed parsing of <code>&lt;svg&gt;</code> <code>foreignObject</code> content nested within a <code>&lt;p&gt;</code>, which could incorrectly move the HTML subtree outside the SVG. <a href="https://redirect.github.com/jhy/jsoup/issues/2452">#2452</a></li>
</ul>
<h3>Internal Changes</h3>
<ul>
<li>Deprecated internal helper <code>org.jsoup.internal.Functions</code> (for removal in v1.23.1). This was previously used to support older Android API levels without full <code>java.util.function</code> coverage; jsoup now requires core library desugaring so this indirection is no longer necessary.
<a href="https://redirect.github.com/jhy/jsoup/pull/2412">#2412</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/jhy/jsoup/commit/8dd66febe8d5e2221a63f4d1228a2a35df81c148"><code>8dd66fe</code></a> [maven-release-plugin] prepare release jsoup-1.22.1</li>
<li><a href="https://github.com/jhy/jsoup/commit/d924385d04898121e537dd2b18c4ae3f80afaead"><code>d924385</code></a> Changelog prep for v1.22.1</li>
<li><a href="https://github.com/jhy/jsoup/commit/0f3100c7bdeebd06fad30594494b268ce1e31e84"><code>0f3100c</code></a> Bump actions/upload-artifact from 5 to 6 (<a href="https://redirect.github.com/jhy/jsoup/issues/2457">#2457</a>)</li>
<li><a href="https://github.com/jhy/jsoup/commit/cf6ac2091b90490ce03ba67270a5f6354be220b4"><code>cf6ac20</code></a> Bump org.apache.maven.plugins:maven-release-plugin from 3.3.0 to 3.3.1 (<a href="https://redirect.github.com/jhy/jsoup/issues/2455">#2455</a>)</li>
<li><a href="https://github.com/jhy/jsoup/commit/6bef9383f7d09023675c31acab433c58bc025084"><code>6bef938</code></a> Fix parsing of SVG foreignObject in paragraphs</li>
<li><a href="https://github.com/jhy/jsoup/commit/9b1c0fc9e9f094ccca1fc6a2288e8063daab116b"><code>9b1c0fc</code></a> Bump org.apache.maven.plugins:maven-release-plugin from 3.2.0 to 3.3.0 (<a href="https://redirect.github.com/jhy/jsoup/issues/2450">#2450</a>)</li>
<li><a href="https://github.com/jhy/jsoup/commit/1415e64f9db9381582616ceaae0b1c63dc1b987f"><code>1415e64</code></a> Bump actions/checkout from 5 to 6 (<a href="https://redirect.github.com/jhy/jsoup/issues/2451">#2451</a>)</li>
<li><a href="https://github.com/jhy/jsoup/commit/0e99fd9b2de3a84c5d1e5db5095c312682a90c0c"><code>0e99fd9</code></a> Isolate TagSet copies to prevent shared mutation (<a href="https://redirect.github.com/jhy/jsoup/issues/2453">#2453</a>)</li>
<li><a href="https://github.com/jhy/jsoup/commit/90019cb8da2ad8ff59e921f886fd29fc66ec2311"><code>90019cb</code></a> Bump
com.github.siom79.japicmp:japicmp-maven-plugin from 0.24.2 to 0.25.0 (<a href="https://redirect.github.com/jhy/jsoup/issues/2">#2</a>...</li>
<li><a href="https://github.com/jhy/jsoup/commit/93952695ed0f56bee161acef89dbee7e78914c9a"><code>9395269</code></a> Don't preemptively close</li>
<li>Additional commits viewable in <a href="https://github.com/jhy/jsoup/compare/jsoup-1.21.2...jsoup-1.22.1">compare view</a></li>
</ul>
</details>
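For the re2j item in the release notes, here is a minimal sketch of confirming which regex engine jsoup ends up using after this bump. It assumes `org.jsoup.helper.Regex.usingRe2j()` is a public static, no-argument method returning a boolean (the changelog names the method but does not give its signature). The reflective call is only there so the snippet compiles without jsoup on the classpath; `Re2jCheck` and `regexEngine` are hypothetical names.

```java
import java.lang.reflect.Method;

public class Re2jCheck {
    /**
     * Reports which regex engine jsoup says it is using.
     * Returns "re2j", "java.util.regex", or "unknown" when jsoup (or the
     * usingRe2j method) cannot be reached from this classpath.
     */
    static String regexEngine() {
        try {
            // Assumption: public static boolean usingRe2j(), as named in the 1.22.1 changelog.
            Class<?> regex = Class.forName("org.jsoup.helper.Regex");
            Method usingRe2j = regex.getMethod("usingRe2j");
            Object result = usingRe2j.invoke(null);
            return Boolean.TRUE.equals(result) ? "re2j" : "java.util.regex";
        } catch (Exception | LinkageError e) {
            // jsoup missing, older than 1.22.1, or the assumed signature is wrong.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Opt out of re2j even if it is on the classpath (property name from the release notes).
        // Assumption: the property has to be set before jsoup first evaluates a regex selector.
        System.setProperty("jsoup.useRe2j", "false");
        System.out.println("jsoup regex engine: " + regexEngine());
    }
}
```

In a module that already depends on jsoup 1.22.1, calling `Regex.usingRe2j()` directly is simpler than the reflective lookup.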

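The proxy fix (#2388) amounts to a fallback rule: use the per-request proxy if one is set, otherwise the system default from `ProxySelector.getDefault()`. A rough sketch of that rule with the JDK `HttpClient` follows. It is an illustration only, not jsoup's implementation, and `ProxyFallback`, `choose`, and `build` are hypothetical names.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.http.HttpClient;

public class ProxyFallback {
    /** A per-request proxy wins; otherwise fall back to the system default selector (may be null). */
    static ProxySelector choose(InetSocketAddress perRequestProxy) {
        return perRequestProxy != null
                ? ProxySelector.of(perRequestProxy)
                : ProxySelector.getDefault();
    }

    /** Builds a JDK HttpClient that uses the chosen selector. */
    static HttpClient build(InetSocketAddress perRequestProxy) {
        HttpClient.Builder builder = HttpClient.newBuilder();
        ProxySelector selector = choose(perRequestProxy);
        if (selector != null) {
            // Builder.proxy rejects null, so only set it when a selector exists.
            builder.proxy(selector);
        }
        return builder.build();
    }
}
```

With this sketch, `ProxyFallback.build(InetSocketAddress.createUnresolved("proxy.example", 8080))` routes HTTP(S) requests through that proxy, while `ProxyFallback.build(null)` defers to the system selector.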