[lldb-dev] UnicodeDecodeError for serialize SBValue description
Follow-up for the previous question: our Python code calls json.dumps() to serialize the variable evaluation result into a string block and send it to the IDE via RPC. However, it fails with "UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10: invalid continuation byte" because SBValue.description seems to return a non-UTF-8 string:

(lldb) fr v
error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member '_M_pod_data' refers to type 0x10bb1e99 which extends beyond the bounds of 0x10b9a901
error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_' refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3
error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size' refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae
(facebook::biggrep::BigGrepMasterAsync *) this = 0x7fd14d374fd0
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
     = {
      small_ = {}
      ml_ = (data_ = "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b", size_ = 0, capacity_ = 1441151880758558720)
    }
  }
}

  File "/data/users/jeffreytan/fbsource/fbobjc/Tools/Nuclide/pkg/nuclide-debugger-lldb-server/scripts/chromedebugger.py", line 91, in received_message
    response_in_json = json.dumps(response);
  File "/usr/lib64/python2.6/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode
    for chunk in self._iterencode_list(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 204, in _iterencode_list
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 294, in _iterencode
    yield encoder(o)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10: invalid continuation byte

Question: is the non-UTF-8 string expected, or is it just garbage data caused by the DW_TAG_member errors? What is the proper way to find out the string's encoding and serialize it using json.dumps()?

Jeffrey
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
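[Editorial note: the crash above can be worked around independently of any DWARF fix by decoding the description bytes leniently before handing them to json.dumps. A minimal sketch in Python 3 syntax (the server above runs Python 2.6, where the same idea applies to str values); the sanitize helper and sample payload are illustrative, not from the thread:]

```python
import json


def sanitize(value):
    """Recursively replace undecodable bytes so json.dumps cannot raise
    UnicodeDecodeError.  Works on nested dicts/lists such as an RPC
    response built from SBValue descriptions."""
    if isinstance(value, bytes):
        # 'replace' substitutes U+FFFD for invalid UTF-8 sequences,
        # preserving the readable parts of the description.
        return value.decode("utf-8", errors="replace")
    if isinstance(value, dict):
        return {sanitize(k): sanitize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [sanitize(v) for v in value]
    return value


# Hypothetical payload with garbage bytes like those lldb printed above:
response = {"description": b"\xc9\xc3UH\x89 mixed with readable text"}
print(json.dumps(sanitize(response)))
```

This trades fidelity for robustness: invalid bytes become U+FFFD and the original byte values are lost in transit.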
[lldb-dev] Fwd: DW_TAG_member extends beyond the bounds error on Linux
Sorry, sent to the wrong alias.

-- Forwarded message --
From: Jeffrey Tan
Date: Sat, Mar 26, 2016 at 3:19 PM
Subject: DW_TAG_member extends beyond the bounds error on Linux
To: llvm-...@lists.llvm.org

Hi,

While dogfooding our lldb-based IDE on Linux, I am seeing a lot of variable evaluation errors related to DW_TAG_member, which prevent us from releasing the IDE. Can anyone confirm whether these are known issues? If not, what information do you need to troubleshoot them?

Here is one example:

(lldb) fr v
error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member '_M_pod_data' refers to type 0x10bb1e99 which extends beyond the bounds of 0x10b9a901
error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_' refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3
error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size' refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae
(facebook::biggrep::BigGrepMasterAsync *) this = 0x7fd14d374fd0
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
     = {
      small_ = {}
      ml_ = (data_ = "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b", size_ = 0, capacity_ = 1441151880758558720)
    }
  }
}
(const string &const) needle = error: summary string parsing error: {
  store_ = {
     = {
      small_ = {}
      ml_ = (data_ = "", size_ = 0, capacity_ = 1080863910568919040)
    }
  }
}
(facebook::biggrep::Options &) options = 0x7fd133cfb7b0: {
  engine = error: summary string parsing error
  full_lines = true
  user = error: summary string parsing error
  max_bytes = 500
  leading_context = 0
  trailing_context = 0
  case_sensitive = true
  client_hostname = error: summary string parsing error
  client_ip = error: summary string parsing error
  skip_logging = false
  client_port = 0
  shards_override = 0
  sample = false
  count = false
  filename_pattern = error: summary string parsing error
  limit = 0
  __isset = {
    engine = true
    full_lines = true
    user = true
    max_bytes = true
    leading_context = true
    trailing_context = true
    case_sensitive = true
    client_hostname = true
    client_ip = true
    skip_logging = true
    client_port = true
    shards_override = true
    sample = true
    count = true
    filename_pattern = true
    limit = true
  }
}
(size_t) recv_timeout = 140536468041728
(std::vector, std::allocator, std::fbstring_core >, std::allocator, std::allocator, std::fbstring_core > > >) corpuses = size=0 {}
(std::vector >) revisions = size=0 {}
(std::vector >) shards = size=0 {}
(std::string) returnRev = error: summary string parsing error
() quote = {}
(std::basic_fbstring, std::allocator, std::fbstring_core >) desc = {
  store_ = {
     = {
      small_ = {}
      ml_ = (data_ = "", size_ = 73415312, capacity_ = 140536494141696)
    }
  }
}
(folly::EventBase *) eb = 0x7fd133cfb888
(apache::thrift::concurrency::ThreadManager *) tm = 0x7fd133cfb570

I suspect each error may have a different root cause. I was able to capture one call stack from a small repro:

Breakpoint 1, DWARFASTParserClang::ParseChildMembers (this=0x8c4520, sc=..., parent_die=..., class_clang_type=..., class_language=lldb::eLanguageTypeUnknown, base_classes=..., member_accessibilities=..., member_function_dies=..., delayed_properties=..., default_accessibility=@0x7ffdf3888cac: lldb::eAccessPublic, is_a_class=@0x7ffdf3888cab: false, layout_info=...)
    at /home/engshare/third-party2/lldb/3.8.0.rc3/src/llvm/tools/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2937
2937            parent_die.GetID());
(gdb) bt
#0  0x7f103d02533d in DWARFASTParserClang::ParseChildMembers(lldb_private::SymbolContext const&, DWARFDIE const&, lldb_private::CompilerType&, lldb::LanguageType, std::vector >&, std::vector >&, DWARFDIECollection&, std::vector >&, lldb::AccessType&, bool&, DWARFASTParserClang::LayoutInfo&) (this=0x8c4520, sc=..., parent_die=..., class_clang_type=..., class_language=lldb::eLanguageTypeUnknown, base_classes=..., member_accessibilities=..., member_function_dies=..., delayed_properties=..., default_accessibility=@0x7ffdf3888cac: lldb::eAccessPublic, is_a_class=@0x7ffdf3888cab: false, layout_info=...)
    at /home/engshare/third-party2/lldb/3.8.0.rc3/src/llvm/tools/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2937
#1  0x7f103d025b84 in DWARFASTParserClang::CompleteTypeFromDWARF(DWARFDIE const&, lldb_private::Type*, lldb_private::CompilerType&) (this=0x8c4520, die=..., type=0xc40a50, clang_type=...)
    at /home/engshare/third-party2/lldb/3.8.0.rc3/src/llvm/tools/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2036
#2  0x7f103d04c5e8 in SymbolFileDWARF::CompleteType(lldb_private::CompilerType&) (this=, compiler_type=...)
    at /home/engshare/third-party2/lldb
Re: [lldb-dev] [llvm-dev] DW_TAG_member extends beyond the bounds error on Linux
Thanks, David. I meant to send this to the lldb mailing list, but glad to get a response here. Our binary is built with gcc:

String dump of section '.comment':
  [ 1]  GCC: (GNU) 4.9.x-google 20150123 (prerelease)

Are there any similar flags we should use? By running "strings -a [binary] | grep -i gcc", I found the following flags being used:

GNU C++ 4.9.x-google 20150123 (prerelease) -momit-leaf-frame-pointer -m64 -mtune=generic -march=x86-64 -g -O3 -O3 -std=gnu++11 -ffunction-sections -fdata-sections -fstack-protector -fno-omit-frame-pointer -fdebug-prefix-map=/home/engshare/third-party2/icu/53.1/src/icu=/home/engshare/third-party2/icu/53.1/src/icu -fdebug-prefix-map=/home/engshare/third-party2/icu/53.1/src/build-gcc-4.9-glibc-2.20-fb/no-pic=/home/engshare/third-party2/icu/53.1/src/icu -fno-strict-aliasing --param ssp-buffer-size=4

Also, per https://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Debugging-Options.html, it seems we should use "-gdwarf-2" to generate only the standard DWARF format? I will probably need to chat with our build team, but I first want to know which flag to ask them for.
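[Editorial note: for reference, the debug-related switches mentioned in this thread can be combined as below. This is a sketch, assuming a GCC 4.9-era toolchain; "server.cpp" is a placeholder, and whether these flags cure this particular DW_TAG_member error is unconfirmed.]

```shell
# -gdwarf-2: emit only standard DWARF v2, avoiding vendor extensions
#   (the flag suggested by the GCC Debugging-Options page cited above).
# -grecord-gcc-switches: record the command line in DW_AT_producer, so it
#   can later be recovered with `strings -a`, as done in this mail.
g++ -g -gdwarf-2 -grecord-gcc-switches -std=gnu++11 -O0 -o server server.cpp

# For clang-built binaries, per David Blaikie's reply below:
clang++ -g -fstandalone-debug -std=gnu++11 -O0 -o server server.cpp

# Verify which compiler produced an existing binary:
readelf -p .comment ./server
```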
Btw: I tried gdb against the same binary, which seems to get a better result:

(gdb) p corpus
$3 = (const std::string &) @0x7fd133cfb888: { static npos = 18446744073709551615, store_ = { static kIsLittleEndian = , static kIsBigEndian = , {
  small_ = "www", '\000' , "\024",
  ml_ = { data_ = 0x77 ::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}, void>::type::value_type folly::fibers::await::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1}>(folly::fibers::FiberManager&, folly::fibers::FirstArgOf::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}, void>::type::value_type folly::fibers::await::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1}, void>::type::value_type)::{lambda(folly::fibers::Fiber&)#1}*>() const+25> "\311\303UH\211\345H\211}\370H\213E\370]ÐUH\211\345H\203\354\020H\211}\370H\213E\370H\211\307\350~\264\312\377\220\311\303UH\211\345SH\203\354\030H\211}\350H\211u\340H\213E\340H\211\307\350\236\377\377\377H\213\030H\213E\350H\211\307\350O\264\312\377H\211ƿ\b",
  size_ = 0, capacity_ = 1441151880758558720

Jeffrey

On Sat, Mar 26, 2016 at 8:22 PM, David Blaikie wrote:
> If you're going to use clang-built binaries with lldb, you'll want to pass
> -fstandalone-debug - this is the default on platforms where lldb is the
> primary debugger (Darwin and FreeBSD).
>
> Not sure if that is the problem you are seeing, but it will be a problem
> sooner or later.
> On Mar 26, 2016
> 4:16 PM, "Jeffrey Tan via llvm-dev" <llvm-...@lists.llvm.org> wrote:
>
>> [...]
Re: [lldb-dev] UnicodeDecodeError for serialize SBValue description
Btw: after patching with Siva's fix http://reviews.llvm.org/D18008, the first field 'small_' is fixed; however, the second field 'ml_' still emits garbage:

(lldb) fr v corpus
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
     = {
      small_ = "www"
      ml_ = (data_ = "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b", size_ = 0, capacity_ = 1441151880758558720)
    }
  }
}

Thanks for any info regarding how to encode this string.

Jeffrey

On Sat, Mar 26, 2016 at 3:34 PM, Jeffrey Tan wrote:
> [...]
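[Editorial note: on the serialization half of the question — if the bytes really are garbage, as the DW_TAG_member errors suggest, there is no encoding to discover. A lossless alternative to U+FFFD replacement is to backslash-escape invalid bytes, so the IDE sees exactly what lldb returned. A sketch in Python 3 (errors="backslashreplace" on decode requires Python 3.5+); the helper name is illustrative:]

```python
import json


def to_json_safe(raw: bytes) -> str:
    # errors="backslashreplace" keeps every undecodable byte visible
    # (e.g. b"\xc9" becomes the four characters '\xc9') instead of
    # collapsing invalid sequences to U+FFFD, so nothing is lost.
    return raw.decode("utf-8", errors="backslashreplace")


# First bytes of the garbage data_ from the gdb dump in this thread
# (\311\303UH\211\345... in gdb's octal notation):
desc = b"\xc9\xc3UH\x89\xe5H\x89}\xf8"
print(json.dumps({"corpus": to_json_safe(desc)}))
```

The receiving side can recover the original bytes if it needs them, at the cost of the escapes being visible in any plain display of the string.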