> On Apr 20, 2016, at 3:08 PM, Jeffrey Tan <jeffrey.fu...@gmail.com> wrote:
>
> Hi Enrico,
>
> Instead of trying function-evaluation c_str(), I decided to decode the
> information from fbstring_core fields. I got it working but have two
> questions about it.
>
> type summary add -F data_formatter.folly_string_formatter -x "std::fbstring_core<char>"
>
> Here is the output:
>
> fr v -T small
> (std::string) small = "small"
>
> fr v -T small.store_
> (std::fbstring_core<char>) small.store_ = None
>
> fr v -T small.store_.ml_
> (std::fbstring_core<char>::MediumLarge) small.store_.ml_ = None
>
> Questions:
> 1. Even though I only added a formatter for std::fbstring_core<char>, why does
> it work for std::string?
> 2. Why do small.store_ and small.store_.ml_ now show a summary of None?
> I would not expect the data formatter to apply to them.
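On question 2, a small sketch may help show why a nested type such as std::fbstring_core<char>::MediumLarge can pick up the summary. It assumes the -x pattern is applied as an unanchored search over the type name, and it uses Python's re module as a stand-in for LLDB's own regex engine, so treat it as an illustration rather than a description of LLDB internals. The helper names are made up for this example.

```python
import re

# The pattern exactly as it was passed to `type summary add -x`.
UNANCHORED = re.compile(r"std::fbstring_core<char>")
# A hypothetical anchored variant that only accepts the whole type name.
ANCHORED = re.compile(r"^std::fbstring_core<char>$")

# Type names taken from the `fr v -T` output quoted above.
TYPE_NAMES = [
    "std::string",
    "std::fbstring_core<char>",
    "std::fbstring_core<char>::MediumLarge",
]


def regex_matches(type_name):
    """True if the unanchored pattern occurs anywhere in the type name."""
    return UNANCHORED.search(type_name) is not None


def anchored_matches(type_name):
    """True only if the entire type name matches the pattern."""
    return ANCHORED.search(type_name) is not None


if __name__ == "__main__":
    for name in TYPE_NAMES:
        print("%-40s unanchored=%-5s anchored=%s"
              % (name, regex_matches(name), anchored_matches(name)))
```

Under that assumption the unanchored pattern also hits the ::MediumLarge name, while an exact name (no -x) or an anchored pattern does not. Note that neither form matches "std::string", so this does not explain question 1.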
Using “-x” means you added a summary for every type that matches the regular expression “std::fbstring_core<char>”.
If you only want to match “std::fbstring_core<char>” exactly, you should leave out the “-x” argument.
That should solve the second problem you are running into.
As for the first issue, I am not sure; I’d need to look at the data_formatter.py file.
>
> Btw: here is the implementation of fbstring_core
> https://github.com/facebook/folly/blob/master/folly/FBString.h
>
> Thanks
> Jeffrey
>
> On Wed, Apr 13, 2016 at 11:08 AM, Enrico Granata <egran...@apple.com> wrote:
> In theory what you're doing looks like it should be supported. I am not sure
> why your example is failing the way it is.
>
> Is your variable a global maybe?
>
> Also, using the variable name is the wrong thing to do. If you have a class
> with a std::string member, the name is going to return the wrong thing. You
> would want to at least use the expression path - and even then there are some
> cases where we can't cons up a proper expression path.
>
> Sent from my iPhone
>
> On Apr 13, 2016, at 11:02 AM, Jeffrey Tan <jeffrey.fu...@gmail.com> wrote:
>
>> I did a quick test calling SBFrame.EvaluateExpression('string.c_str()')
>> for the summary. The result shows valobj.GetFrame() returns None, so does
>> this mean this is not supported?
>>
>> def DoTest(valobj,internal_dict):
>>     print "valobj: %s" % valobj
>>     print "valobj.GetFrame(): %s" % valobj.GetFrame()
>>     summaryValue = valobj.GetFrame().EvaluateExpression(valobj.name + '.c_str()')
>>     print "summaryValue: %s" % summaryValue
>>     return 'Summary from c_str(): %s ' % summaryValue.GetSummary()
>>
>> type summary add -F DoTest -x "std::fbstring_core<char>"
>>
>> Output:
>> valobj.GetFrame(): No value
>> summaryValue: No value
>> valobj: (std::string) $6 = {
>>   store_ = Summary from c_str(): None
>> }
>>
>> Jeffrey
>>
>> On Wed, Apr 13, 2016 at 10:11 AM, Jeffrey Tan <jeffrey.fu...@gmail.com> wrote:
>> One quick question: do we support getting the type summary string from an
>> inferior method call? After reading our own fbstring_core code, I found I need
>> to mirror a lot of what the fbstring_core.c_str() method is doing in python. I
>> wonder if we can just call ${var.c_str()} as the type summary? I suspect one
>> of the concerns is side effects (the inferior method may throw an exception or
>> cause problems) but I do not see why this can't be done. By allowing this
>> we can keep one copy of the data formatter truth (in source code) instead of
>> a potentially out-of-sync one (let's say the std::string author decided to
>> change its implementation, the python data formatter associated with it needs
>> to be modified at the same time, which is a maintenance nightmare).
>>
>> Jeffrey
>>
>> On Thu, Apr 7, 2016 at 10:33 AM, Enrico Granata <egran...@apple.com> wrote:
>>
>>> On Apr 6, 2016, at 7:31 PM, Jeffrey Tan <jeffrey.fu...@gmail.com> wrote:
>>>
>>> Thanks Enrico. This is very detailed! I will take a look.
>>> Btw: originally, I was hoping that data formatter can be added without
>>> changing the source code.
>>> Like giving an XML/JSON format file telling lldb
>>> the memory layout/structure of the data structure, lldb can parse the
>>> XML/JSON and deduce the formatting. This is the approach used by the data
>>> visualizer in the VS debugger:
>>> https://msdn.microsoft.com/en-us/library/jj620914.aspx
>>> This will make adding data formatters more extensible/flexible. Any reason
>>> we did not take this approach?
>>>
>>
>> The way I understand the Natvis system, it allows one to provide a bunch of
>> expressions that describe how the debugger would go about retrieving the
>> interesting data bits.
>> This has the bonus of being really easy, since you’re writing code in the
>> same language/context of the types you’re formatting.
>> On the other hand it has a few drawbacks, in terms of performance as well as
>> safety (imagine trying to run code on an object when said object is in an
>> incoherent state).
>> The LLDB approach, on the other hand, is that you should try to not run code
>> when providing these data formatters. In order to do that, we vend an API
>> that can do things such as retrieve child values, read memory, cast values,
>> …, all without code execution.
>> Once you have this kind of API that is not expressed in your source
>> language, you might just as well describe it in a scripting language. Hence
>> were born the Python data formatters.
>> In order for us to gain even more performance for native system types that
>> we know we’re gonna run into all the time, we then switched a bunch of the
>> “mission critical” formatters from Python to C++.
>> The Python extension points are still available, as Jim pointed out, and you
>> are more than welcome to use those instead of modifying the debugger core.
>>
>>> Jeffrey
>>>
>>> On Wed, Apr 6, 2016 at 11:49 AM, Enrico Granata <egran...@apple.com> wrote:
>>>
>>>> On Apr 5, 2016, at 2:42 PM, Jeffrey Tan <jeffrey.fu...@gmail.com> wrote:
>>>>
>>>> Hi Enrico,
>>>>
>>>> Any suggestion/example how to add a data formatter for our own STL string?
>>>> From the output below I can see we are using our own "fbstring_core" which
>>>> I assume I need to write a type summary for this type:
>>>>
>>>> frame variable corpus -T
>>>> (const string &const) corpus = error: summary string parsing error: {
>>>>   (std::fbstring_core<char>) store_ = {
>>>>     (std::fbstring_core<char>::(anonymous union)) = {
>>>>       (char [24]) small_ = "www"
>>>>       (std::fbstring_core<char>::MediumLarge) ml_ = {
>>>>         (char *) data_ = 0x0000000000777777
>>>> "H\x89U\xa8H\x89M\xa0L\x89E\x98H\x8bE\xa8H\x89��_U��D\x88e�H\x8bE\xa0H\x89��]U��H\x89�H\x8dE�H\x89�H\x89���
>>>> ��L\x8dm�H\x8bE\x98H\x89��IU��\x88]�L\x8be\xb0L\x89��
>>>>         (std::size_t) size_ = 0
>>>>         (std::size_t) capacity_ = 1441151880758558720
>>>>       }
>>>>     }
>>>>   }
>>>> }
>>>>
>>>
>>> Admittedly, this is going to be a little vague since I haven’t really seen
>>> your code and I am only working off of one sample.
>>>
>>> There’s going to be two parts to getting this to work:
>>>
>>> Part 1 - Formatting fbstring_core
>>>
>>> At a glance, an fbstring_core<char> can be backed by two representations.
>>> A “small” representation (a char array), and a “medium/large” representation
>>> (a char* + a size).
>>> I assume that the way you tell one from the other is
>>>
>>> if (size == 0) small
>>> else medium-large
>>>
>>> If my assumption is not correct, you’ll need to discover what the correct
>>> discriminator logic is - the class has to know, and so do you :-)
>>>
>>> Armed with that knowledge, look in lldb
>>> source/Plugins/Language/CPlusPlus/Formatters/LibCxx.cpp
>>> There’s a bunch of code that deals with formatting llvm’s libc++
>>> std::string - which follows a very similar logic to your class.
>>>
>>> ExtractLibcxxStringInfo() is the function that handles discovering which
>>> layout the string uses - where the data lives - and how much data there is.
>>>
>>> Once you have told yourself how much data there is (the size) and where it
>>> lives (array or pointer), LibcxxStringSummaryProvider() has the easy task -
>>> it sets up a StringPrinter, tells it how much data to print, where to get
>>> it from, and then delegates the StringPrinter to do the grunt work.
>>> StringPrinter is a nifty little tool - it can handle generating summaries
>>> for different kinds of strings (UTF8? UTF16? we got it - is a \0 a
>>> terminator? what quote character would you like?
>>> …) - you point it at some
>>> data, set up a few options, and it will generate a printable representation
>>> for you - if your string type is doing anything out of the ordinary, let’s
>>> talk - I am definitely open to extending StringPrinter to handle even more
>>> magic.
>>>
>>> Part 2 - Teaching std::string that it can be backed by an fbstring_core
>>>
>>> At the end of part 1, you’ll probably end up with a
>>> FBStringCoreSummaryProvider() - now you need to teach LLDB about it.
>>> The obvious thing you could do would be to go in
>>> CPlusPlusLanguage::GetFormatters() and add a LoadFBStringFormatter(g_category)
>>> to it - and then imitate - say - LoadLibCxxFormatters()
>>>
>>> AddCXXSummary(cpp_category_sp,
>>> lldb_private::formatters::FBStringCoreSummaryProvider, "fbstringcore
>>> summary provider", ConstString("std::fbstring_core<.+>"),
>>> stl_summary_flags, true);
>>>
>>> That will work - but what you would see is:
>>>
>>>> (const string &const) corpus = error: summary string parsing error: {
>>>> (std::fbstring_core<char>) store_ = "www"
>>>
>>> You wanna do
>>>
>>> (lldb) log enable lldb formatters
>>> (lldb) frame variable -T corpus
>>>
>>> It will list one or more typenames - the most specific one is the one you
>>> like (e.g. for libc++ we get std::__1::string - this is how we tell
>>> ourselves this is the std::string from libc++).
>>> Once you find that typename, you’ll make a new formatter -
>>> FBStringSummaryProvider() - and register that formatter with that very
>>> specific typename.
>>>
>>> All that FBStringSummaryProvider() has to do is get the “store_” member
>>> (ValueObject::GetChildMemberWithName() is your friend) - and pass it down
>>> to FBStringCoreSummaryProvider()
>>>
>>>
>>> I understand this may seem a little convoluted and arcane at first - but
>>> feel free to ask more questions, and I’ll try to help out!
>>>
>>>> Thanks.
>>>> Jeffrey
>>>>
>>>> On Mon, Mar 28, 2016 at 11:38 AM, Enrico Granata <egran...@apple.com> wrote:
>>>> This is kind of orthogonal to your problem, but the reason why you are not
>>>> seeing the kind of simplified printing Greg is suggesting, is because your
>>>> std::string doesn’t look like any of the kinds we recognize.
>>>>
>>>> Specifically, LLDB data formatters work by matching against type names,
>>>> and once they recognize a typename, then they try to inspect the variable
>>>> in order to grab a summary.
>>>> In your example, your std::string exposes a layout that we are not
>>>> handling - hence we bail out of the formatter and we fall back to the raw
>>>> view.
>>>>
>>>> If you want pretty printing to work, you’ll need to write a data formatter.
>>>>
>>>> There are a few avenues. The obvious easy one is to extend the existing
>>>> std::string formatter to recognize your type’s internal layout.
>>>> If one were signing up for more infrastructure work, they could decide to
>>>> try and detect shared library loads and load formatters that match with
>>>> whatever libraries are being loaded.
>>>>
>>>>> On Mar 28, 2016, at 9:47 AM, Greg Clayton via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>>>>>
>>>>> So you need to be prepared to escape any text that can have special
>>>>> characters. A "std::string" or any container can contain special
>>>>> characters. If you are encoding stuff into JSON, you will either need to
>>>>> escape any special characters, or hex encode the string into ASCII hex
>>>>> bytes.
>>>>>
>>>>> In debuggers we often get bogus data because variables are not
>>>>> initialized, but the compiler tells us that a variable is valid in
>>>>> address range [0x1000-0x2000), but it actually is [0x1200-0x2000). If we
>>>>> read a variable in this case, a std::string might contain bogus data and
>>>>> the bytes might not make sense.
>>>>> So you always have to be prepared for bad data.
>>>>>
>>>>> If we look at:
>>>>>
>>>>> store_ = {
>>>>>    = {
>>>>>     small_ = "www"
>>>>>     ml_ = (data_ =
>>>>> "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
>>>>> size_ = 0, capacity_ = 1441151880758558720)
>>>>>   }
>>>>> }
>>>>> }
>>>>>
>>>>> We can see the "size_" is zero, and capacity_ is 1441151880758558720
>>>>> (which is 0x1400000000000000). "data_" seems to be some random pointer.
>>>>>
>>>>> On MacOSX, we have special formatting code that displays std::string in
>>>>> CPlusPlusLanguage.cpp that gets installed in the LoadLibCxxFormatters()
>>>>> or LoadLibStdcppFormatters() functions with code like:
>>>>>
>>>>> lldb::TypeSummaryImplSP std_string_summary_sp(new
>>>>> CXXFunctionSummaryFormat(stl_summary_flags,
>>>>> lldb_private::formatters::LibcxxStringSummaryProvider, "std::string
>>>>> summary provider"));
>>>>>
>>>>> cpp_category_sp->GetTypeSummariesContainer()->Add(ConstString("std::__1::string"),
>>>>> std_string_summary_sp);
>>>>>
>>>>> Special flags are set on std::string to say "don't show children of this
>>>>> and just show a summary". So if a std::string contained "hello", i.e. for
>>>>> the following code:
>>>>>
>>>>> std::string h ("hello");
>>>>>
>>>>> You should just see:
>>>>>
>>>>> (lldb) fr var h
>>>>> (std::__1::string) h = "hello"
>>>>>
>>>>> If you take a look at the normal value in the raw we see:
>>>>>
>>>>> (lldb) fr var --raw h
>>>>> (std::__1::string) h = {
>>>>>   __r_ = {
>>>>>     std::__1::__libcpp_compressed_pair_imp<std::__1::basic_string<char,
>>>>> std::__1::char_traits<char>, std::__1::allocator<char> >::__rep,
>>>>> std::__1::allocator<char>, 2> = {
>>>>>       __first_ = {
>>>>>          = {
>>>>>           __l = {
>>>>>             __cap_ = 122511465736202
>>>>>             __size_ = 0
>>>>>             __data_ = 0x0000000000000000
>>>>>           }
>>>>>           __s = {
>>>>>              = {
>>>>>               __size_ = '\n'
>>>>>               __lx = '\n'
>>>>>             }
>>>>>             __data_ = {
>>>>>               [0] = 'h'
>>>>>               [1] = 'e'
>>>>>               [2] = 'l'
>>>>>               [3] = 'l'
>>>>>               [4] = 'o'
>>>>>               [5] = '\0'
>>>>>               [6] = '\0'
>>>>>               [7] = '\0'
>>>>>               [8] = '\0'
>>>>>               [9] = '\0'
>>>>>               [10] = '\0'
>>>>>               [11] = '\0'
>>>>>               [12] = '\0'
>>>>>               [13] = '\0'
>>>>>               [14] = '\0'
>>>>>               [15] = '\0'
>>>>>               [16] = '\0'
>>>>>               [17] = '\0'
>>>>>               [18] = '\0'
>>>>>               [19] = '\0'
>>>>>               [20] = '\0'
>>>>>               [21] = '\0'
>>>>>               [22] = '\0'
>>>>>             }
>>>>>           }
>>>>>           __r = {
>>>>>             __words = {
>>>>>               [0] = 122511465736202
>>>>>               [1] = 0
>>>>>               [2] = 0
>>>>>             }
>>>>>           }
>>>>>         }
>>>>>       }
>>>>>     }
>>>>>   }
>>>>> }
>>>>>
>>>>> So the main question is why are our "std::string" formatters not kicking
>>>>> in for you. That comes down to a typename match, or the format of the
>>>>> string isn't what the formatter is expecting.
>>>>>
>>>>> But again, since your std::string can contain anything, you will need to
>>>>> escape any and all text that is encoded into JSON to ensure it doesn't
>>>>> contain anything JSON can't deal with.
>>>>>
>>>>>> On Mar 27, 2016, at 9:20 PM, Jeffrey Tan via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>>>>>>
>>>>>> Thanks Siva. All the DW_TAG_member related errors seem to go away after
>>>>>> patching with your fix.
>>>>>> The current problem is handling the decoding.
>>>>>>
>>>>>> Here is the correct decoding from gdb which might be useful:
>>>>>> (gdb) p corpus
>>>>>> $3 = (const std::string &) @0x7fd133cfb888: {
>>>>>>   static npos = 18446744073709551615, store_ = {
>>>>>>     static kIsLittleEndian = <optimized out>,
>>>>>>     static kIsBigEndian = <optimized out>, {
>>>>>>       small_ = "www", '\000' <repeats 20 times>, "\024", ml_ = {
>>>>>>         data_ = 0x777777 <std::_Any_data::_M_access<void
>>>>>> folly::fibers::Baton::waitFiber<folly::fibers::FirstArgOf<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1},
>>>>>> void>::type::value_type
>>>>>> folly::fibers::await<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1}>(folly::fibers::FiberManager&,
>>>>>>
>>>>>> folly::fibers::FirstArgOf<folly::fibers::FirstArgOf<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1},
>>>>>> void>::type::value_type
>>>>>> folly::fibers::await<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise<facebook::servicerouter::RequestDispatcherBase<facebook::servicerouter::ThriftDispatcher>::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1},
>>>>>> void>::type::value_type)::{lambda(folly::fibers::Fiber&)#1}*>()
>>>>>> const+25>
>>>>>> "\311\303UH\211\345H\211}\370H\213E\370]ÐUH\211\345H\203\354\020H\211}\370H\213E\370H\211\307\350~\264\312\377\220\311\303UH\211\345SH\203\354\030H\211}\350H\211u\340H\213E\340H\211\307\350\236\377\377\377H\213\030H\213E\350H\211\307\350O\264\312\377H\211ƿ\b",
>>>>>>         size_ = 0,
>>>>>>         capacity_ = 1441151880758558720}}}}
>>>>>>
>>>>>> Utf-16 does not seem to decode it, while 'latin-1' does:
>>>>>> >>> '\xc9'.decode('utf-16')
>>>>>> Traceback (most recent call last):
>>>>>>   File "<stdin>", line 1, in <module>
>>>>>>   File
>>>>>> "/mnt/gvfs/third-party2/python/55c1fd79d91c77c95932db31a4769919611c12bb/2.7.8/centos6-native/da39a3e/lib/python2.7/encodings/utf_16.py",
>>>>>> line 16, in decode
>>>>>>     return codecs.utf_16_decode(input, errors, True)
>>>>>> UnicodeDecodeError: 'utf16' codec can't decode byte 0xc9 in position 0:
>>>>>> truncated data
>>>>>> >>> '\xc9'.decode('latin-1')
>>>>>> u'\xc9'
>>>>>>
>>>>>> Instead of guessing what kind of decoding I should use, I would use
>>>>>> 'ensure_ascii=False' to prevent the crash for now.
>>>>>>
>>>>>> I tried to reproduce this crash, but it seems that the crash might be
>>>>>> related with some internal stl implementation we are using. I will see
>>>>>> if I can narrow down to a small repro later.
>>>>>>
>>>>>> Thanks
>>>>>> Jeffrey
>>>>>>
>>>>>> On Sun, Mar 27, 2016 at 2:49 PM, Siva Chandra <sivachan...@gmail.com> wrote:
>>>>>> On Sat, Mar 26, 2016 at 11:58 PM, Jeffrey Tan <jeffrey.fu...@gmail.com> wrote:
>>>>>>> Btw: after patching with Siva's fix http://reviews.llvm.org/D18008, the
>>>>>>> first field 'small_' is fixed, however the second field 'ml_' still
>>>>>>> emits garbage:
>>>>>>>
>>>>>>> (lldb) fr v corpus
>>>>>>> (const string &const) corpus = error: summary string parsing error: {
>>>>>>>   store_ = {
>>>>>>>      = {
>>>>>>>       small_ = "www"
>>>>>>>       ml_ = (data_ =
>>>>>>> "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
>>>>>>> size_ = 0, capacity_ = 1441151880758558720)
>>>>>>>     }
>>>>>>>   }
>>>>>>> }
>>>>>>
>>>>>> Do you still see the DW_TAG_member related error?
>>>>>>
>>>>>> A wild (and really wild at that) guess: Is it utf16 data that is being
>>>>>> decoded as utf8?
>>>>>>
>>>>>> As David Blaikie mentioned on the other thread, it would really help
>>>>>> if you provide us with a minimal example to repro this. At least, repro
>>>>>> instructions.
>>>>>>
>>>>>> _______________________________________________
>>>>>> lldb-dev mailing list
>>>>>> lldb-dev@lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

Thanks,
- Enrico
📩 egranata@.com ☎️ 27683
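
For anyone following the "decode the fields" route mentioned at the top of this message, here is a minimal sketch of the small versus medium/large discrimination in plain Python. It is based on my reading of the linked FBString.h for a 64-bit little-endian build, where the category lives in the top two bits of the last byte and a small string stores 23 minus its length in that byte; those constants are assumptions to verify against your folly version. The function name is hypothetical, and fetching the 24 raw bytes from the debugger (for example through the SB API) is left out.

```python
import struct

# Assumed constants for fbstring_core<char> on 64-bit little-endian builds.
MAX_SMALL_SIZE = 23           # sizeof(MediumLarge) - 1
CATEGORY_MASK = 0xC0          # top two bits of the last byte
CATEGORIES = {0x00: "small", 0x80: "medium", 0x40: "large"}
UINT64_MASK = 0xFFFFFFFFFFFFFFFF


def decode_fbstring_core(raw):
    """Classify 24 raw bytes of an fbstring_core<char>.

    Returns ("small", bytes) for in-situ strings, or
    (category, (data_pointer, size, capacity)) for medium/large ones.
    """
    raw = bytes(raw)
    if len(raw) != 24:
        raise ValueError("expected 24 bytes, got %d" % len(raw))
    last = raw[23]
    category = CATEGORIES.get(last & CATEGORY_MASK)
    if category is None:
        raise ValueError("unknown category bits: 0x%02x" % (last & CATEGORY_MASK))
    if category == "small":
        # The last byte holds (MAX_SMALL_SIZE - length) for small strings.
        size = MAX_SMALL_SIZE - last
        if size < 0:
            raise ValueError("corrupt small-string length byte: %d" % last)
        return category, raw[:size]
    # Medium/large: three 64-bit words; the category bits share capacity_.
    data_ptr, size, capacity = struct.unpack("<QQQ", raw)
    capacity &= ~(CATEGORY_MASK << 56) & UINT64_MASK
    return category, (data_ptr, size, capacity)


if __name__ == "__main__":
    # Bytes matching the gdb dump in this thread: "www", 20 NULs, then "\024".
    sample = b"www" + b"\x00" * 20 + b"\x14"
    print(decode_fbstring_core(sample))
```

This lines up with the dumps quoted in this thread: the last byte of small_ is "\024" (20, i.e. 23 - 3), which is also why capacity_ reads as 0x1400000000000000 when the same bytes are viewed through ml_. For medium/large strings the characters still have to be read from the returned pointer by the debugger.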