Hi folks.
The key to fast use of WTF::String is to avoid creating temporary
WTF::StringImpl objects or temporary copies of string data.
With the latest enhancements to WTF::String, here are the preferred fast ways
to build a new string:
- A single expression with the + operator and arguments of type
WTF::String, char, UChar, const char*, const UChar*, Vector<char>, and
WTF::AtomicString.
- A call to the WTF::makeString function.
- An expression that calls a single function on the string, uses the +
operator exactly once, or uses the += operator with the types it supports
directly.
- WTF::StringBuilder, in cases where computing the pieces of the string
involves complex branching or requires a loop.
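To make the "build it all at once" idea concrete, here is a minimal sketch of single-pass concatenation. It is written against std::string so it compiles on its own; it is not the WTF::makeString implementation, and concatOnce is a hypothetical name made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of WTF): joins all pieces using a single
// reservation for the result. Requires C++17 and at least one argument.
template <typename... Pieces>
std::string concatOnce(const Pieces&... pieces)
{
    // View every argument without copying its characters.
    const std::string_view views[] = { std::string_view(pieces)... };

    // First pass: add up the lengths so the final size is known up front.
    std::size_t total = 0;
    for (std::string_view view : views)
        total += view.size();

    // Second pass: reserve once, then copy each piece exactly once.
    std::string result;
    result.reserve(total);
    for (std::string_view view : views)
        result.append(view);
    return result;
}
```

A call such as concatOnce(prefix, ": ", message) then produces the result without any intermediate strings, which is the property the preferred idioms above share.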
Here are acceptable, but not preferred, ways to build a new string:
- Building up a Vector<UChar> followed by WTF::String::adopt. I believe
StringBuilder is always better, so we should probably retire this idiom.
Inefficient ways to build a new string include any uses of more than one of the
following:
- WTF::String::append.
- The += operator.
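A toy model may help show why repeated appends are costly. This sketch assumes an immutable, shared buffer that must be reallocated on every modification; ToyString, joinWithAppend, and joinWithBuilder are hypothetical names, and the real StringImpl differs in its details. A plain std::string plays the role of the builder.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a string backed by an immutable shared buffer.
// This is NOT WTF::String; it only counts how many buffers get created.
class ToyString {
public:
    static inline std::size_t buffersCreated = 0;

    ToyString() = default; // empty string, no buffer allocated yet

    explicit ToyString(std::string chars)
        : m_buffer(std::make_shared<const std::string>(std::move(chars)))
    {
        ++buffersCreated;
    }

    // Appending cannot grow the immutable buffer in place, so it builds a
    // new buffer holding the old characters plus the new piece.
    void append(const std::string& piece)
    {
        *this = ToyString(value() + piece);
    }

    std::string value() const { return m_buffer ? *m_buffer : std::string(); }

private:
    std::shared_ptr<const std::string> m_buffer;
};

// Inefficient idiom: one new buffer per appended piece.
ToyString joinWithAppend(const std::vector<std::string>& pieces)
{
    ToyString result;
    for (const std::string& piece : pieces)
        result.append(piece);
    return result;
}

// Builder idiom: accumulate in a growable scratch buffer, then create the
// immutable string exactly once at the end.
ToyString joinWithBuilder(const std::vector<std::string>& pieces)
{
    std::string scratch;
    for (const std::string& piece : pieces)
        scratch += piece;
    return ToyString(std::move(scratch));
}
```

In this model, joining four pieces with joinWithAppend creates four buffers and recopies the earlier characters each time, while joinWithBuilder creates one.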
There are other operations that modify a WTF::String; none of those are
efficient if the string in question is then modified further.
- WTF::String::insert.
- WTF::String::replace.
In addition, there are quite a few operations that return a WTF::String, and
none of those are efficient if the string in question is then modified further.
- WTF::String::number.
- WTF::String::substring.
- WTF::String::left.
- WTF::String::right.
- WTF::String::lower.
- WTF::String::upper.
- WTF::String::stripWhiteSpace.
- WTF::String::simplifyWhiteSpace.
- WTF::String::removeCharacters.
- WTF::String::foldCase.
- WTF::String::format.
- WTF::String::fromUTF8.
One reason I bring this up is that if we wanted to make combinations of these
more efficient, we might be able to use techniques similar to those used in
StringOperators.h to make it so the entire result string is built at one time,
eliminating unnecessary copies of the string characters and intermediate
StringImpl objects on the heap.
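For readers who have not seen this kind of technique, here is a rough sketch of deferred concatenation: operator+ returns a lightweight node that only records its operands, and the characters are copied once, when the whole expression is converted to a string. This is my own simplified illustration using std::string, not the code in StringOperators.h; Text, Concat, pieceLength, and writePiece are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical leaf type standing in for a string class we control.
struct Text {
    std::string chars;
};

template <typename Left, typename Right> struct Concat;

// Length and copy-out primitives for a leaf.
inline std::size_t pieceLength(const Text& text) { return text.chars.size(); }
inline void writePiece(const Text& text, std::string& out) { out.append(text.chars); }

// The same primitives for a deferred "left + right" node (defined below).
template <typename L, typename R> std::size_t pieceLength(const Concat<L, R>& node);
template <typename L, typename R> void writePiece(const Concat<L, R>& node, std::string& out);

// Deferred node: holds references only, so building it copies no characters.
// It must be converted within the same full expression that created it;
// storing it with `auto` would leave dangling references.
template <typename Left, typename Right>
struct Concat {
    const Left& left;
    const Right& right;

    // The only place the result buffer is reserved and filled.
    operator std::string() const
    {
        std::string out;
        out.reserve(pieceLength(*this));
        writePiece(*this, out);
        return out;
    }
};

template <typename L, typename R>
std::size_t pieceLength(const Concat<L, R>& node)
{
    return pieceLength(node.left) + pieceLength(node.right);
}

template <typename L, typename R>
void writePiece(const Concat<L, R>& node, std::string& out)
{
    writePiece(node.left, out);
    writePiece(node.right, out);
}

// a + b starts a chain; (chain) + c extends it without copying anything.
inline Concat<Text, Text> operator+(const Text& left, const Text& right)
{
    return { left, right };
}

template <typename L, typename R>
Concat<Concat<L, R>, Text> operator+(const Concat<L, R>& left, const Text& right)
{
    return { left, right };
}
```

With this in place, std::string s = a + b + c; walks the expression tree once to size the buffer and once to fill it. Extending the same pattern to functions that return strings is the kind of change the paragraph above speculates about.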
It would be interesting to find out how often the inefficient idioms are used.
Until recently, there was no significantly better alternative to the
inefficient idioms, so it’s highly likely we have them in multiple places.
A quick grep showed me inefficient uses of += in XMLDocumentParser::handleError,
XPath::FunTranslate::evaluate, parseRFC822HeaderFields,
InspectorStyleSheet::addRule, drawElementTitle in DOMNodeHighlighter.cpp,
WebKitCSSTransformValue::cssText, CSSSelector::selectorText,
CSSPrimitiveValue::cssText, CSSBorderImageValue::cssText, and
CSSParser::createKeyframeRule.
I would not be surprised if at least some of these showed up immediately
with the right kind of performance test. The CSS parsing and serialization
functions seem almost certain to be measurably slow.
I’m looking for two related things:
1) A clean way to find and root out uses of the inefficient idioms that we
can work on together as a team.
2) Some ways to further refine WTF::String so it’s harder to “use it
wrong”. I don’t have any immediate steps in mind, but one possibility would be
to remove functions that are usually part of poorly-performing idioms, pushing
WebKit programmers subtly in the direction of operations that don’t build
intermediate strings.
-- Darin