Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: 87e6806cfe50025cb8b1176c5ba799bc763e08eb
https://github.com/WebKit/WebKit/commit/87e6806cfe50025cb8b1176c5ba799bc763e08eb
Author: Wenson Hsieh <[email protected]>
Date: 2025-12-02 (Tue, 02 Dec 2025)
Changed paths:
M Source/WebCore/page/text-extraction/TextExtractionTypes.h
M Source/WebKit/Shared/TextExtractionToStringConversion.cpp
M Tools/TestWebKitAPI/Tests/WebKitCocoa/TextExtractionTests.mm
Log Message:
-----------
[AutoFill Debugging] Text extraction can incorrectly omit text from links
https://bugs.webkit.org/show_bug.cgi?id=303355
rdar://165665247
Reviewed by Richard Robinson.
In 302898@main, I introduced logic to avoid surfacing the text inside of a link separately in the
case where the text is redundant with the link URL. However, it's also possible for clients to
override the extracted href of a link, in such a way that the text content may no longer be
redundant.
The existing logic doesn't account for this, and so we end up losing information about visible text
inside the link altogether, since the original URL is replaced by the client and we omit the text
content because we think it's already contained within the original URL.
To fix this, we instead check the `TextExtractionAggregator`'s current URL (which accounts for the
client's overridden attribute).
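The difference between the two checks can be sketched in isolation. This is an illustration only, not the code in this change: `LinkSketch`, `effectiveURL`, `textIsRedundantWith`, and both `shouldOmitText...` helpers are hypothetical names, and the substring test is an assumed stand-in for whatever rule `childTextNodeIsRedundant` actually applies.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for an extracted link; not the actual WebCore item type.
struct LinkSketch {
    std::string originalURL;                // href as found in the page
    std::optional<std::string> overrideURL; // href substituted by the client, if any
    std::string text;                       // visible text inside the link
};

// The URL that ends up in the extracted output: the client's override when present.
inline const std::string& effectiveURL(const LinkSketch& link)
{
    return link.overrideURL ? *link.overrideURL : link.originalURL;
}

// Assumed redundancy rule: non-empty text is redundant when the URL contains it.
inline bool textIsRedundantWith(const std::string& text, const std::string& url)
{
    return !text.empty() && url.find(text) != std::string::npos;
}

// Behavior described as the bug: compare against the original URL only.
inline bool shouldOmitTextBeforeFix(const LinkSketch& link)
{
    return textIsRedundantWith(link.text, link.originalURL);
}

// Behavior described as the fix: compare against the URL that is actually emitted.
inline bool shouldOmitTextAfterFix(const LinkSketch& link)
{
    return textIsRedundantWith(link.text, effectiveURL(link));
}
```

Under these assumptions, a link whose text is "example.com", whose original href is "https://example.com/", and whose href the client replaced with an unrelated URL would have its text dropped by the first check and kept by the second.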
Test: TextExtractionTests.FilterRedundantTextInLinks
* Source/WebCore/page/text-extraction/TextExtractionTypes.h:
(WebCore::TextExtraction::Item::hasData const):
(WebCore::TextExtraction::Item::dataAs const):
* Source/WebKit/Shared/TextExtractionToStringConversion.cpp:
(WebKit::childTextNodeIsRedundant):
(WebKit::addTextRepresentationRecursive):
* Tools/TestWebKitAPI/Tests/WebKitCocoa/TextExtractionTests.mm:
(TestWebKitAPI::TEST(TextExtractionTests, FilterRedundantTextInLinks)):
Canonical link: https://commits.webkit.org/303756@main