Here is the example I was thinking of: https://reviews.llvm.org/D24013
It didn't yet get submitted, so you can see the current code by looking at GDBRemoteCommunicationClient.cpp lines 381-420. The shortened code is in the above patch. But here they are for reference: Before: // Look for a list of compressions in the features list e.g. // qXfer:features:read+;PacketSize=20000;qEcho+;SupportedCompressions=zlib-deflate,lzma const char *features_list = ::strstr(response_cstr, "qXfer:features:"); if (features_list) { const char *compressions = ::strstr(features_list, "SupportedCompressions="); if (compressions) { std::vector<std::string> supported_compressions; compressions += sizeof("SupportedCompressions=") - 1; const char *end_of_compressions = strchr(compressions, ';'); if (end_of_compressions == NULL) { end_of_compressions = strchr(compressions, '\0'); } const char *current_compression = compressions; while (current_compression < end_of_compressions) { const char *next_compression_name = strchr(current_compression, ','); const char *end_of_this_word = next_compression_name; if (next_compression_name == NULL || end_of_compressions < next_compression_name) { end_of_this_word = end_of_compressions; } if (end_of_this_word) { if (end_of_this_word == current_compression) { current_compression++; } else { std::string this_compression( current_compression, end_of_this_word - current_compression); supported_compressions.push_back(this_compression); current_compression = end_of_this_word + 1; } } else { supported_compressions.push_back(current_compression); current_compression = end_of_compressions; } } if (supported_compressions.size() > 0) { MaybeEnableCompression(supported_compressions); } } } After: std::vector<llvm::StringRef> supported_compressions; llvm::StringRef this_compression; llvm::StringRef compressions = response_str.split("qXfer:features:").second .split("SupportedCompressions=").second .split(";").first; while (!compressions.empty()) { std::tie(this_compression, compressions) = compressions.split(','); supported_compressions.push_back(this_compression); } if (!supported_compressions.empty()) MaybeEnableCompression(supported_compressions); Hopefully we can agree the "After" version is easier to understand :) Not to mention more efficient, since it makes 0 copies, and the original version makes a copy for every compression in the list. In any case, I agree there are times where you don't need all this functionality. I just don't see a reason NOT to use it since it makes better code fall out naturally without even having to try. On Mon, Sep 12, 2016 at 7:11 PM Zachary Turner <ztur...@google.com> wrote: > The StringRef does in fact store a length. So if you write this: > > std::string S("This is a test"); > StringRef R(S); > > Then R internally contains > const char *Ptr = "This is a test" > size_t Len = 14 > > This makes repeated operations on StringRefs really fast (and efficient), > because there's no copying involved. For example, you could say: > > R.consume_front("This"); > R.consume_back("test"); > R = R.trim(); > > And you would end up with R = "is a". In this case, The internal ptr > would refer to the original pointer + 5, and the length would now be 4. > > If you have two StringRefs, R and R2, and you say "if (R == R2)", then > since both of them store their length, then it's a straight memcmp since > both lengths are already known in advance. > > There are lots of useful operations that if you don't have a StringRef, > you won't ever think to use, but once you start using them, you'll never > want to go back. That's why I say we should always prefer StringRef unless > there is a specific reason not to. It doesn't prevent you from doing > anything that you can already do with a const char*, but it allows you to > do much more. The converse is not true. If you vend out const char*'s, > you will end up with a lot of excessively verbose code, because as soon as > someone sees a const char*, they will start looking for the appropriate C > method to use to solve their problem, which is probably going to turn into > a lot of difficult to read / get correct code, when there's a good chance > that there was already a method that did exactly what they wanted in > StringRef and would have saved a ton of code. > > One example that I found recently was there was a string formatted as a ; > separated list. Each semicolon was followed by some prefix indicating what > followed. One of those fields was itself a nested comma separated list. > So it was something like this: > > std::string s = > Field1=...;Field2=...;Field3=value1,value2,value3;Field4=... > > The code to parse this was more than one screen. Converting it to > StringRef and using StringRef::split() it became about 10 lines. > > I agree you don't always "need" it, but you never know how people are > going to use it, and I can't think of a reason *not* to use it. And when > you do need it, it makes code drastically simpler. > > On Mon, Sep 12, 2016 at 6:24 PM Jim Ingham <jing...@apple.com> wrote: > > I see the whole content, but I'll reply to this one so the reply doesn't > get truncated on your end... > > > On Sep 12, 2016, at 6:03 PM, Zachary Turner <ztur...@google.com> wrote: > > > > > > Immediately, very little. A small amount of performance, since > comparing StringRefs is faster than comparing null terminated stings, since > the length is stored with the string it can use memcmp instead of strcmp. > > > > From a big picture perspective, quite a lot IMO. StringRef has numerous > methods which are extremely useful and not easy to reproduce correctly. > Also, If you do the comparison many times, you only pay for the length > computation once. > > > > Also, this email is so long that it's truncating in my email reader. So > I'm not sure if the second part of my email went through. It's possible > your reader is better than mine at handling large emails, but I'm copying > the answer to your second question below just in case you also didn't see > it. > > > > Thanks for the comments. > > > > > > I don't see the benefit of using StringRef's to return all the key > names. I'm generally only ever going to pass them to the StructuredData > API's, which makes them into StringRef's transparently on the way in. How > would building StringRefs help here? > > > > > You also suggested changing a bunch of BreakpointOption API to > return StringRef's. OTOH this CL just mechanically changed from m_options > to m_options_up, so changing the API was not part of the patches intent. > OTOH most of these options (condition, thread name, etc) can take and > return NULL strings. I didn't think StringRef's handled null strings well, > was I wrong about that? And again, what would I gain by making these > StringRef's? I'm just going to stash them away in some storage the object > owns, and I'm not going to ever share the storage with the caller. So > least common denominator (const char *) seems a better choice here. If the > caller wants a StringRef they are welcome to make one. > > > > Right, but making the StringRef incurs a length computation. That's not > something you want to do over and over. It's guaranteed someone somewhere > is going to compute the length, so it's better to do it once upfront, and > allow everyone else to never have to do it again. > > Maybe I don't understand how StringRef's work. I thought they just > wrapped some foreign storage - a string constant, char * or std::string? > So for the length computation to be shared for an object handing out > StringRef's, the object would have to keep both the string and it's > associated StringRef. If the functions just RETURN a StringRef that wraps > a string constant, you'll calculate the length every time. So IIUC: > > static const char *GetSerializationKey() { return "Breakpoint"; } > > becomes: > > static StringRef GetSerializationKey() { static const StringRef > contents("Breakpoint"); return contents; } > > except now this has a non-trivial constructor, so I should really put a > std::once around the initializer, right? That just seems like way more > trouble than it is worth to keep from computing a length a couple of times. > > > > > On the other hand, using a StringRef gives you many advantages. Unless > you know every possible way in which these strings will ever be used, who > knows what someone might want to do with it? What if someone wants to take > one of these strings, check if some other string starts with it, and chop > it off if so? You could either write: > > > > if (strncmp(GetName(), str.c_str(), strlen(GetName()) == 0) > > str2 = str.substr(strlen(GetName())); > > > > which there's a good chance will be either wrong or suboptimal, or you > could write: > > > > str.consume_front(GetName()); > > > > which is both easier to understand, obviously correct, and optimal from > a performance standpoint. > > > > const char* should almost never be used for anything unless you > absolutely must interface with a C-api for which there is no good > equivalent on StringRef (very rare). > > Hum. I would say "If you want to start parsing up a string, put it in a > StringRef, you'll like it..." But if you are handing out a string > constant, "const char *" is fine, and the consumers can dress it up if they > want. > > > > > Since we currently use const char* in many places, this sometimes makies > interfacing difficult, but as more common classes move to StringRef, this > will go away and almost all code will become faster and easier to > understand. > > > > You are right that StringRefs don't handle null strings, but they do > handle empty strings, and it's not common that empty and null need to be > treated as distinct values. > > The classes you reference treat nullptr as "not set". I could go change > that too, but I'd rather write some tests. > > Jim > > >
_______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits