Re: [Lldb-commits] [lldb] r281273 - This is the main part of a change to add breakpoint save and restore to lldb.

Zachary Turner via lldb-commits Mon, 12 Sep 2016 20:00:33 -0700

Here is the example I was thinking of:

https://reviews.llvm.org/D24013


It didn't yet get submitted, so you can see the current code by looking at
GDBRemoteCommunicationClient.cpp lines 381-420.  The shortened code is in
the above patch.  But here they are for reference:

Before:
    // Look for a list of compressions in the features list e.g.
    //
qXfer:features:read+;PacketSize=20000;qEcho+;SupportedCompressions=zlib-deflate,lzma
    const char *features_list = ::strstr(response_cstr, "qXfer:features:");
    if (features_list) {
      const char *compressions =
          ::strstr(features_list, "SupportedCompressions=");
      if (compressions) {
        std::vector<std::string> supported_compressions;
        compressions += sizeof("SupportedCompressions=") - 1;
        const char *end_of_compressions = strchr(compressions, ';');
        if (end_of_compressions == NULL) {
          end_of_compressions = strchr(compressions, '\0');
        }
        const char *current_compression = compressions;
        while (current_compression < end_of_compressions) {
          const char *next_compression_name = strchr(current_compression,
',');
          const char *end_of_this_word = next_compression_name;
          if (next_compression_name == NULL ||
              end_of_compressions < next_compression_name) {
            end_of_this_word = end_of_compressions;
          }

          if (end_of_this_word) {
            if (end_of_this_word == current_compression) {
              current_compression++;
            } else {
              std::string this_compression(
                  current_compression, end_of_this_word -
current_compression);
              supported_compressions.push_back(this_compression);
              current_compression = end_of_this_word + 1;
            }
          } else {
            supported_compressions.push_back(current_compression);
            current_compression = end_of_compressions;
          }
        }

        if (supported_compressions.size() > 0) {
          MaybeEnableCompression(supported_compressions);
        }
      }
    }

After:

        std::vector<llvm::StringRef> supported_compressions;
        llvm::StringRef this_compression;
        llvm::StringRef compressions =
            response_str.split("qXfer:features:").second
                        .split("SupportedCompressions=").second
                        .split(";").first;
        while (!compressions.empty())
        {
            std::tie(this_compression, compressions) =
compressions.split(',');
            supported_compressions.push_back(this_compression);
        }
        if (!supported_compressions.empty())
            MaybeEnableCompression(supported_compressions);

Hopefully we can agree the "After" version is easier to understand :)  Not
to mention more efficient, since it makes 0 copies, and the original
version makes a copy for every compression in the list.

In any case, I agree there are times where you don't need all this
functionality.  I just don't see a reason NOT to use it since it makes
better code fall out naturally without even having to try.

On Mon, Sep 12, 2016 at 7:11 PM Zachary Turner <ztur...@google.com> wrote:

> The StringRef does in fact store a length.  So if you write this:
>
> std::string S("This is a test");
> StringRef R(S);
>
> Then R internally contains
> const char *Ptr = "This is a test"
> size_t Len = 14
>
> This makes repeated operations on StringRefs really fast (and efficient),
> because there's no copying involved.  For example, you could say:
>
> R.consume_front("This");
> R.consume_back("test");
> R = R.trim();
>
> And you would end up with R = "is a".  In this case, The internal ptr
> would refer to the original pointer + 5, and the length would now be 4.
>
> If you have two StringRefs, R and R2, and you say "if (R == R2)", then
> since both of them store their length, then it's a straight memcmp since
> both lengths are already known in advance.
>
> There are lots of useful operations that if you don't have a StringRef,
> you won't ever think to use, but once you start using them, you'll never
> want to go back.  That's why I say we should always prefer StringRef unless
> there is a specific reason not to.  It doesn't prevent you from doing
> anything that you can already do with a const char*, but it allows you to
> do much more.  The converse is not true.  If you vend out const char*'s,
> you will end up with a lot of excessively verbose code, because as soon as
> someone sees a const char*, they will start looking for the appropriate C
> method to use to solve their problem, which is probably going to turn into
> a lot of difficult to read / get correct code, when there's a good chance
> that there was already a method that did exactly what they wanted in
> StringRef and would have saved a ton of code.
>
> One example that I found recently was there was a string formatted as a ;
> separated list.  Each semicolon was followed by some prefix indicating what
> followed.  One of those fields was itself a nested comma separated list.
> So it was something like this:
>
> std::string s =
> Field1=...;Field2=...;Field3=value1,value2,value3;Field4=...
>
> The code to parse this was more than one screen.  Converting it to
> StringRef and using StringRef::split() it became about 10 lines.
>
> I agree you don't always "need" it, but you never know how people are
> going to use it, and I can't think of a reason *not* to use it.  And when
> you do need it, it makes code drastically simpler.
>
> On Mon, Sep 12, 2016 at 6:24 PM Jim Ingham <jing...@apple.com> wrote:
>
> I see the whole content, but I'll reply to this one so the reply doesn't
> get truncated on your end...
>
> > On Sep 12, 2016, at 6:03 PM, Zachary Turner <ztur...@google.com> wrote:
> >
> >
> > Immediately, very little.  A small amount of performance, since
> comparing StringRefs is faster than comparing null terminated stings, since
> the length is stored with the string it can use memcmp instead of strcmp.
> >
> > From a big picture perspective, quite a lot IMO.  StringRef has numerous
> methods which are extremely useful and not easy to reproduce correctly.
> Also, If you do the comparison many times, you only pay for the length
> computation once.
> >
> > Also, this email is so long that it's truncating in my email reader.  So
> I'm not sure if the second part of my email went through.  It's possible
> your reader is better than mine at handling large emails, but I'm copying
> the answer to your second question below just in case you also didn't see
> it.
> >
> > Thanks for the comments.
> > >
> > >   I don't see the benefit of using StringRef's to return all the key
> names.  I'm generally only ever going to pass them to the StructuredData
> API's, which makes them into StringRef's transparently on the way in.  How
> would building StringRefs help here?
> >
> > >   You also suggested changing a bunch of BreakpointOption API to
> return StringRef's.  OTOH this CL just mechanically changed from m_options
> to m_options_up, so changing the API was not part of the patches intent.
> OTOH most of these options (condition, thread name, etc) can take and
> return NULL strings.  I didn't think StringRef's handled null strings well,
> was I wrong about that?  And again, what would I gain by making these
> StringRef's?  I'm just going to stash them away in some storage the object
> owns, and I'm not going to ever share the storage with the caller.  So
> least common denominator (const char *) seems a better choice here.  If the
> caller wants a StringRef they are welcome to make one.
> >
> > Right, but making the StringRef incurs a length computation.  That's not
> something you want to do over and over.  It's guaranteed someone somewhere
> is going to compute the length, so it's better to do it once upfront, and
> allow everyone else to never have to do it again.
>
> Maybe I don't understand how StringRef's work.  I thought they just
> wrapped some foreign storage - a string constant, char * or std::string?
> So for the length computation to be shared for an object handing out
> StringRef's, the object would have to keep both the string and it's
> associated StringRef.  If the functions just RETURN a StringRef that wraps
> a string constant, you'll calculate the length every time.  So IIUC:
>
> static const char *GetSerializationKey() { return "Breakpoint"; }
>
> becomes:
>
> static StringRef GetSerializationKey() { static const StringRef
> contents("Breakpoint"); return contents; }
>
> except now this has a non-trivial constructor, so I should really put a
> std::once around the initializer, right?  That just seems like way more
> trouble than it is worth to keep from computing a length a couple of times.
>
> >
> > On the other hand, using a StringRef gives you many advantages.  Unless
> you know every possible way in which these strings will ever be used, who
> knows what someone might want to do with it?  What if someone wants to take
> one of these strings, check if some other string starts with it, and chop
> it off if so?  You could either write:
> >
> > if (strncmp(GetName(), str.c_str(), strlen(GetName()) == 0)
> >     str2 = str.substr(strlen(GetName()));
> >
> > which there's a good chance will be either wrong or suboptimal, or you
> could write:
> >
> > str.consume_front(GetName());
> >
> > which is both easier to understand, obviously correct, and optimal from
> a performance standpoint.
> >
> > const char* should almost never be used for anything unless you
> absolutely must interface with a C-api for which there is no good
> equivalent on StringRef (very rare).
>
> Hum.  I would say "If you want to start parsing up a string, put it in a
> StringRef, you'll like it..."  But if you are handing out a string
> constant, "const char *" is fine, and the consumers can dress it up if they
> want.
>
> >
> > Since we currently use const char* in many places, this sometimes makies
> interfacing difficult, but as more common classes move to StringRef, this
> will go away and almost all code will become faster and easier to
> understand.
> >
> > You are right that StringRefs don't handle null strings, but they do
> handle empty strings, and it's not common that empty and null need to be
> treated as distinct values.
>
> The classes you reference treat nullptr as "not set".  I could go change
> that too, but I'd rather write some tests.
>
> Jim
>
>
>

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Re: [Lldb-commits] [lldb] r281273 - This is the main part of a change to add breakpoint save and restore to lldb.

Reply via email to