[webkit-dev] Slow idioms with WTF::String

Darin Adler Tue, 12 Jul 2011 10:25:31 -0700

Hi folks.

The key to fast use of WTF::String is to avoid creating temporary 
WTF::StringImpl objects or temporary copies of string data.


With the latest enhancements to WTF::String, here are the preferred fast ways 
to build a new string:

    - A single expression with the + operator and arguments of type 
WTF::String, char, UChar, const char*, const UChar*, Vector<char>, and 
WTF::AtomicString.
    - A call to the WTF::makeString function.
    - An expression that calls a single function on the string, uses the + 
operator exactly once, or uses the += operator with the types it supports 
directly.
    - WTF::StringBuilder, in cases where computing the pieces of the string 
involves complex branching or requires a loop.
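
As a rough illustration of why a single + expression or a makeString call can 
be fast, here is a minimal standalone sketch. It is not WebKit code: 
std::string stands in for the result buffer, and concatOnce, pieceLength, 
appendPiece, totalLength, and appendAll are hypothetical names. The idea it 
models is "measure every piece, allocate once, then copy each piece into 
place".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in model, not WebKit code. All helper names are hypothetical.

// Length of each supported piece type.
inline std::size_t pieceLength(const std::string& s) { return s.size(); }
inline std::size_t pieceLength(const char* s) { return std::strlen(s); }
inline std::size_t pieceLength(char) { return 1; }

// Copy one piece onto the end of the output buffer.
inline void appendPiece(std::string& out, const std::string& s) { out.append(s); }
inline void appendPiece(std::string& out, const char* s) { out.append(s); }
inline void appendPiece(std::string& out, char c) { out.push_back(c); }

// Sum the lengths of all pieces (recursion ends at the empty overload).
inline std::size_t totalLength() { return 0; }
template <typename First, typename... Rest>
std::size_t totalLength(const First& first, const Rest&... rest)
{
    return pieceLength(first) + totalLength(rest...);
}

// Append all pieces in order.
inline void appendAll(std::string&) { }
template <typename First, typename... Rest>
void appendAll(std::string& out, const First& first, const Rest&... rest)
{
    appendPiece(out, first);
    appendAll(out, rest...);
}

// Measure every piece, reserve once, then copy each piece into place.
template <typename... Pieces>
std::string concatOnce(const Pieces&... pieces)
{
    std::string result;
    result.reserve(totalLength(pieces...));
    appendAll(result, pieces...);
    return result;
}
```

A call such as concatOnce("<", name, '>') touches each character once and 
creates no intermediate strings, which is the property the preferred idioms 
above are after.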

Here is an acceptable, but not preferred, way to build a new string:

    - Building up a Vector<UChar> followed by WTF::String::adopt. I believe 
StringBuilder is always better, so we should probably retire this idiom.

Inefficient ways to build a new string include any uses of more than one of the 
following:

    - WTF::String::append.
    - The += operator.
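
To make the cost of repeated appends concrete, here is a small standalone 
model. It is not WebKit code and assumes nothing about the real StringImpl 
beyond what is said above: ModelString is a hypothetical class whose 
characters live in an immutable heap buffer, so each append has to create a 
new buffer. The counter g_buffersCreated and both join functions are also 
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Stand-in model, not WebKit code. All names here are hypothetical.
static std::size_t g_buffersCreated = 0;

class ModelString {
public:
    ModelString() = default;

    // Creating a string from characters allocates one immutable buffer.
    explicit ModelString(const std::string& text)
        : m_buffer(std::make_shared<std::string>(text))
    {
        ++g_buffersCreated;
    }

    // Copies the old characters plus the new ones into a fresh buffer.
    void append(const std::string& suffix)
    {
        *this = ModelString(str() + suffix);
    }

    const std::string& str() const
    {
        static const std::string empty;
        return m_buffer ? *m_buffer : empty;
    }

private:
    std::shared_ptr<const std::string> m_buffer;
};

// Inefficient shape: one new buffer (and one full copy) per append.
inline ModelString joinWithAppend(const std::vector<std::string>& parts)
{
    ModelString result;
    for (const std::string& part : parts)
        result.append(part);
    return result;
}

// Builder shape: accumulate in a growable scratch buffer, then create the
// final immutable buffer exactly once.
inline ModelString joinWithBuilder(const std::vector<std::string>& parts)
{
    std::string scratch;
    for (const std::string& part : parts)
        scratch += part;
    return ModelString(scratch);
}
```

In this model, joining four parts with joinWithAppend creates four buffers 
and recopies the earlier characters each time, while joinWithBuilder creates 
one. The real numbers in WebKit depend on the actual implementation, so treat 
this only as a picture of the shape of the problem.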

There are other operations that modify the WTF::String; none of those are 
efficient if the string in question is then modified further.

    - WTF::String::insert.
    - WTF::String::replace.

In addition, there are quite a few operations that return a WTF::String, and 
none of those are efficient if the string in question is then modified further.

    - WTF::String::number.
    - WTF::String::substring.
    - WTF::String::left.
    - WTF::String::right.
    - WTF::String::lower.
    - WTF::String::upper.
    - WTF::String::stripWhiteSpace.
    - WTF::String::simplifyWhiteSpace.
    - WTF::String::removeCharacters.
    - WTF::String::foldCase.
    - WTF::String::format.
    - WTF::String::fromUTF8.

One reason I bring this up is that if we wanted to make combinations of these 
more efficient, we might be able to use techniques similar to those used in 
StringOperators.h to make it so the entire result string is built at one time, 
eliminating unnecessary copies of the string characters and intermediate 
StringImpl objects on the heap.
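
One general way to get "the entire result string is built at one time" is to 
have operators return a lightweight description of the operation and defer 
the copying until a real string is needed. The sketch below shows that 
pattern in standalone form. It is an assumption that this resembles 
StringOperators.h in spirit; it is not taken from that header, and Piece, 
Concat, length, and writeTo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in model, not the contents of StringOperators.h.
// All names here are hypothetical.

// Wraps an existing string by reference, so that + builds a description of
// the concatenation instead of a new string.
struct Piece {
    const std::string& text;
    std::size_t length() const { return text.size(); }
    void writeTo(std::string& out) const { out.append(text); }
};

// Records "left followed by right" without copying any characters.
template <typename Left, typename Right>
struct Concat {
    Left left;
    Right right;

    std::size_t length() const { return left.length() + right.length(); }

    void writeTo(std::string& out) const
    {
        left.writeTo(out);
        right.writeTo(out);
    }

    // The only place characters are copied in this model: size the result
    // once, then fill it.
    operator std::string() const
    {
        std::string result;
        result.reserve(length());
        writeTo(result);
        return result;
    }
};

// Piece + Piece starts a chain.
inline Concat<Piece, Piece> operator+(Piece a, Piece b) { return {a, b}; }

// Chain + Piece extends it; the nesting is encoded in the type.
template <typename L, typename R>
Concat<Concat<L, R>, Piece> operator+(Concat<L, R> a, Piece b) { return {a, b}; }
```

With std::string variables a, b, and c, the statement 
std::string joined = Piece{a} + Piece{b} + Piece{c}; builds only small Concat 
objects on the stack until the final conversion, which does a single reserve 
and one pass of copying. Extending the same idea to operations such as 
substring or lower would need a length and writeTo for each of them, which is 
the kind of work the paragraph above is speculating about.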

It would be interesting to find out how often the inefficient idioms are used. 
Until recently, there was no significantly better alternative to the 
inefficient idioms, so it’s highly likely we have them in multiple places.

A quick grep showed me inefficient uses of += in XMLDocumentParser::handleError, 
XPath::FunTranslate::evaluate, parseRFC822HeaderFields, 
InspectorStyleSheet::addRule, drawElementTitle in DOMNodeHighlighter.cpp, 
WebKitCSSTransformValue::cssText, CSSSelector::selectorText, 
CSSPrimitiveValue::cssText, CSSBorderImageValue::cssText, and 
CSSParser::createKeyframeRule.

I would not be surprised if at least some of these show up immediately 
with the right kind of performance test. The CSS parsing and serialization 
functions seem almost certain to be measurably slow.

I’m looking for two related things:

    1) A clean way to find and root out uses of the inefficient idioms that we 
can work on together as a team.

    2) Some ways to further refine WTF::String so it’s harder to “use it 
wrong”. I don’t have any immediate steps in mind, but one possibility would be 
to remove functions that are usually part of poorly-performing idioms, pushing 
WebKit programmers subtly in the direction of operations that don’t build 
intermediate strings.

    -- Darin
