The StringRef does in fact store a length. So if you write this: std::string S("This is a test"); StringRef R(S);
Then R internally contains const char *Ptr = "This is a test" size_t Len = 14 This makes repeated operations on StringRefs really fast (and efficient), because there's no copying involved. For example, you could say: R.consume_front("This"); R.consume_back("test"); R = R.trim(); And you would end up with R = "is a". In this case, The internal ptr would refer to the original pointer + 5, and the length would now be 4. If you have two StringRefs, R and R2, and you say "if (R == R2)", then since both of them store their length, then it's a straight memcmp since both lengths are already known in advance. There are lots of useful operations that if you don't have a StringRef, you won't ever think to use, but once you start using them, you'll never want to go back. That's why I say we should always prefer StringRef unless there is a specific reason not to. It doesn't prevent you from doing anything that you can already do with a const char*, but it allows you to do much more. The converse is not true. If you vend out const char*'s, you will end up with a lot of excessively verbose code, because as soon as someone sees a const char*, they will start looking for the appropriate C method to use to solve their problem, which is probably going to turn into a lot of difficult to read / get correct code, when there's a good chance that there was already a method that did exactly what they wanted in StringRef and would have saved a ton of code. One example that I found recently was there was a string formatted as a ; separated list. Each semicolon was followed by some prefix indicating what followed. One of those fields was itself a nested comma separated list. So it was something like this: std::string s = Field1=...;Field2=...;Field3=value1,value2,value3;Field4=... The code to parse this was more than one screen. Converting it to StringRef and using StringRef::split() it became about 10 lines. I agree you don't always "need" it, but you never know how people are going to use it, and I can't think of a reason *not* to use it. And when you do need it, it makes code drastically simpler. On Mon, Sep 12, 2016 at 6:24 PM Jim Ingham <jing...@apple.com> wrote: > I see the whole content, but I'll reply to this one so the reply doesn't > get truncated on your end... > > > On Sep 12, 2016, at 6:03 PM, Zachary Turner <ztur...@google.com> wrote: > > > > > > Immediately, very little. A small amount of performance, since > comparing StringRefs is faster than comparing null terminated stings, since > the length is stored with the string it can use memcmp instead of strcmp. > > > > From a big picture perspective, quite a lot IMO. StringRef has numerous > methods which are extremely useful and not easy to reproduce correctly. > Also, If you do the comparison many times, you only pay for the length > computation once. > > > > Also, this email is so long that it's truncating in my email reader. So > I'm not sure if the second part of my email went through. It's possible > your reader is better than mine at handling large emails, but I'm copying > the answer to your second question below just in case you also didn't see > it. > > > > Thanks for the comments. > > > > > > I don't see the benefit of using StringRef's to return all the key > names. I'm generally only ever going to pass them to the StructuredData > API's, which makes them into StringRef's transparently on the way in. How > would building StringRefs help here? > > > > > You also suggested changing a bunch of BreakpointOption API to > return StringRef's. OTOH this CL just mechanically changed from m_options > to m_options_up, so changing the API was not part of the patches intent. > OTOH most of these options (condition, thread name, etc) can take and > return NULL strings. I didn't think StringRef's handled null strings well, > was I wrong about that? And again, what would I gain by making these > StringRef's? I'm just going to stash them away in some storage the object > owns, and I'm not going to ever share the storage with the caller. So > least common denominator (const char *) seems a better choice here. If the > caller wants a StringRef they are welcome to make one. > > > > Right, but making the StringRef incurs a length computation. That's not > something you want to do over and over. It's guaranteed someone somewhere > is going to compute the length, so it's better to do it once upfront, and > allow everyone else to never have to do it again. > > Maybe I don't understand how StringRef's work. I thought they just > wrapped some foreign storage - a string constant, char * or std::string? > So for the length computation to be shared for an object handing out > StringRef's, the object would have to keep both the string and it's > associated StringRef. If the functions just RETURN a StringRef that wraps > a string constant, you'll calculate the length every time. So IIUC: > > static const char *GetSerializationKey() { return "Breakpoint"; } > > becomes: > > static StringRef GetSerializationKey() { static const StringRef > contents("Breakpoint"); return contents; } > > except now this has a non-trivial constructor, so I should really put a > std::once around the initializer, right? That just seems like way more > trouble than it is worth to keep from computing a length a couple of times. > > > > > On the other hand, using a StringRef gives you many advantages. Unless > you know every possible way in which these strings will ever be used, who > knows what someone might want to do with it? What if someone wants to take > one of these strings, check if some other string starts with it, and chop > it off if so? You could either write: > > > > if (strncmp(GetName(), str.c_str(), strlen(GetName()) == 0) > > str2 = str.substr(strlen(GetName())); > > > > which there's a good chance will be either wrong or suboptimal, or you > could write: > > > > str.consume_front(GetName()); > > > > which is both easier to understand, obviously correct, and optimal from > a performance standpoint. > > > > const char* should almost never be used for anything unless you > absolutely must interface with a C-api for which there is no good > equivalent on StringRef (very rare). > > Hum. I would say "If you want to start parsing up a string, put it in a > StringRef, you'll like it..." But if you are handing out a string > constant, "const char *" is fine, and the consumers can dress it up if they > want. > > > > > Since we currently use const char* in many places, this sometimes makies > interfacing difficult, but as more common classes move to StringRef, this > will go away and almost all code will become faster and easier to > understand. > > > > You are right that StringRefs don't handle null strings, but they do > handle empty strings, and it's not common that empty and null need to be > treated as distinct values. > > The classes you reference treat nullptr as "not set". I could go change > that too, but I'd rather write some tests. > > Jim > > >
_______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits