[lldb-dev] UnicodeDecodeError for serialize SBValue description

2016-03-26 Thread Jeffrey Tan via lldb-dev
Follow-up for the previous question:

Our Python code is trying to call json.dumps to serialize the variable
evaluation result into a string block and send it to the IDE via RPC;
however, it fails with "UnicodeDecodeError: 'utf8' codec can't decode byte
0xc9 in position 10: invalid continuation byte" because SBValue.description
seems to return a non-UTF-8 string:

(lldb) fr v
*error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member '_M_pod_data'
refers to type 0x10bb1e99 which extends beyond the bounds of 0x10b9a901*
*error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_'
refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3*
*error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size'
refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae*
(facebook::biggrep::BigGrepMasterAsync *) this = 0x7fd14d374fd0
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
 = {
  small_ = {}
  *ml_ = (data_ =
"��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
size_ = 0, capacity_ = 1441151880758558720)*
}
  }
}


File
"/data/users/jeffreytan/fbsource/fbobjc/Tools/Nuclide/pkg/nuclide-debugger-lldb-server/scripts/chromedebugger.py",
line 91, in received_message
*response_in_json = json.dumps(response);*
  File "/usr/lib64/python2.6/json/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
  File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
chunks = list(self.iterencode(o))
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode
for chunk in self._iterencode_list(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 204, in _iterencode_list
for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 294, in _iterencode
yield encoder(o)
*UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10:
invalid continuation byte*

Question:
Is the non-UTF-8 string expected, or is it just garbage data caused by the
DW_TAG_member error? What is the proper way to find out the string encoding
and serialize it using *json.dumps()*?
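
A minimal sketch of one workaround — not from the thread, and the helper
name is made up — is to decode the description with replacement characters
so json.dumps never raises, at the cost of losing the undecodable bytes:

```python
import json

def safe_description(raw):
    # SBValue descriptions can embed raw memory bytes (like the ml_ buffer
    # in the dump above) that are not valid UTF-8.  Decoding with
    # errors='replace' substitutes U+FFFD for each bad byte, so the result
    # is always a valid unicode string that json.dumps can serialize.
    return raw.decode('utf-8', errors='replace')

raw = b'size_ = 0 \xc9UH\x89'   # 0xc9 with no valid continuation byte
payload = json.dumps({'description': safe_description(raw)})
```

On Python 2.6 (as in the traceback above) the same idea applies via
`raw.decode('utf-8', 'replace')` on a `str`.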

Jeffrey
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] Fwd: DW_TAG_member extends beyond the bounds error on Linux

2016-03-26 Thread Jeffrey Tan via lldb-dev
Sorry, sent to the wrong alias.

-- Forwarded message --
From: Jeffrey Tan 
Date: Sat, Mar 26, 2016 at 3:19 PM
Subject: DW_TAG_member extends beyond the bounds error on Linux
To: llvm-...@lists.llvm.org


Hi,

While dogfooding our lldb-based IDE on Linux, I am seeing a lot of variable
evaluation errors related to DW_TAG_member which prevent us from releasing
the IDE. Can anyone confirm whether these are known issues? If not, what
information do you need to troubleshoot?

Here is one example:

(lldb) fr v
*error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member '_M_pod_data'
refers to type 0x10bb1e99 which extends beyond the bounds of 0x10b9a901*
*error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_'
refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3*
*error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size'
refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae*
(facebook::biggrep::BigGrepMasterAsync *) this = 0x7fd14d374fd0
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
 = {
  small_ = {}
  *ml_ = (data_ =
"��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
size_ = 0, capacity_ = 1441151880758558720)*
}
  }
}
*(const string &const) needle = error: summary string parsing error: {*
  store_ = {
 = {
  small_ = {}
  ml_ = (data_ = "", size_ = 0, capacity_ = 1080863910568919040)
}
  }
}
(facebook::biggrep::Options &) options = 0x7fd133cfb7b0: {
  engine = error: summary string parsing error
  full_lines = true
  user = error: summary string parsing error
  max_bytes = 500
  leading_context = 0
  trailing_context = 0
  case_sensitive = true
  client_hostname = error: summary string parsing error
  client_ip = error: summary string parsing error
  skip_logging = false
  client_port = 0
  shards_override = 0
  sample = false
  count = false
  filename_pattern = error: summary string parsing error
  limit = 0
  __isset = {
engine = true
full_lines = true
user = true
max_bytes = true
leading_context = true
trailing_context = true
case_sensitive = true
client_hostname = true
client_ip = true
skip_logging = true
client_port = true
shards_override = true
sample = true
count = true
filename_pattern = true
limit = true
  }
}
(size_t) recv_timeout = 140536468041728
(std::vector,
std::allocator, std::fbstring_core >,
std::allocator,
std::allocator, std::fbstring_core > > >) corpuses = size=0 {}
(std::vector >)
revisions = size=0 {}
(std::vector >)
shards = size=0 {}
*(std::string) returnRev = error: summary string parsing error*
() quote = {}
(std::basic_fbstring, std::allocator,
std::fbstring_core >) desc = {
  store_ = {
 = {
  small_ = {}
  ml_ = (data_ = "", size_ = 73415312, capacity_ = 140536494141696)
}
  }
}
(folly::EventBase *) eb = 0x7fd133cfb888
(apache::thrift::concurrency::ThreadManager *) tm = 0x7fd133cfb570

I suspect each one may have a different root cause. I was able to capture
one call stack of a small repro:

Breakpoint 1, DWARFASTParserClang::ParseChildMembers (this=0x8c4520,
sc=..., parent_die=..., class_clang_type=...,
class_language=lldb::eLanguageTypeUnknown,
base_classes=..., member_accessibilities=..., member_function_dies=...,
delayed_properties=...,
default_accessibility=@0x7ffdf3888cac: lldb::eAccessPublic,
is_a_class=@0x7ffdf3888cab: false, layout_info=...)
at
/home/engshare/third-party2/lldb/3.8.0.rc3/src/llvm/tools/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2937
2937
 parent_die.GetID());
(gdb) bt
#0  0x7f103d02533d in
DWARFASTParserClang::ParseChildMembers(lldb_private::SymbolContext const&,
DWARFDIE const&, lldb_private::CompilerType&, lldb::LanguageType,
std::vector >&, std::vector >&, DWARFDIECollection&,
std::vector >&,
lldb::AccessType&, bool&, DWARFASTParserClang::LayoutInfo&) (this=0x8c4520,
sc=..., parent_die=..., class_clang_type=...,
class_language=lldb::eLanguageTypeUnknown, base_classes=...,
member_accessibilities=..., member_function_dies=...,
delayed_properties=..., default_accessibility=@0x7ffdf3888cac:
lldb::eAccessPublic, is_a_class=@0x7ffdf3888cab: false, layout_info=...) at
/home/engshare/third-party2/lldb/3.8.0.rc3/src/llvm/tools/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2937
#1  0x7f103d025b84 in
DWARFASTParserClang::CompleteTypeFromDWARF(DWARFDIE const&,
lldb_private::Type*, lldb_private::CompilerType&) (this=0x8c4520, die=...,
type=0xc40a50, clang_type=...)
at
/home/engshare/third-party2/lldb/3.8.0.rc3/src/llvm/tools/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2036
#2  0x7f103d04c5e8 in
SymbolFileDWARF::CompleteType(lldb_private::CompilerType&) (this=, compiler_type=...)
at
/home/engshare/third-party2/lldb/3.8.0.rc3/src/llvm/tools/lldb

Re: [lldb-dev] [llvm-dev] DW_TAG_member extends beyond the bounds error on Linux

2016-03-26 Thread Jeffrey Tan via lldb-dev
Thanks David. I meant to send this to the lldb mailing list, but glad to
hear a response here.

Our binary is built from gcc:
String dump of section '.comment':
  [ 1]  GCC: (GNU) 4.9.x-google 20150123 (prerelease)

Are there any similar flags we should use? By running "strings -a [binary] |
grep -i gcc", I found the following flags being used:
GNU C++ 4.9.x-google 20150123 (prerelease) -momit-leaf-frame-pointer -m64
-mtune=generic -march=x86-64 -g -O3 -O3 -std=gnu++11 -ffunction-sections
-fdata-sections -fstack-protector -fno-omit-frame-pointer
-fdebug-prefix-map=/home/engshare/third-party2/icu/53.1/src/icu=/home/engshare/third-party2/icu/53.1/src/icu
-fdebug-prefix-map=/home/engshare/third-party2/icu/53.1/src/build-gcc-4.9-glibc-2.20-fb/no-pic=/home/engshare/third-party2/icu/53.1/src/icu
-fno-strict-aliasing --param ssp-buffer-size=4

Also, per
https://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Debugging-Options.html, it
seems we should use "-gdwarf-2" to generate only the standard DWARF format?
I think I need to chat with our build team, but I want to know which flag to
ask them for first.

Btw: I tried gdb against the same binary, which seems to get a better result:

(gdb) p corpus
$3 = (const std::string &) @0x7fd133cfb888: {
  static npos = 18446744073709551615, store_ = {
static kIsLittleEndian = ,
static kIsBigEndian = , {
  small_ = "www", '\000' , "\024", ml_ = {
data_ = 0x77 ::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1},
void>::type::value_type
folly::fibers::await::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1}>(folly::fibers::FiberManager&,
folly::fibers::FirstArgOf::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1},
void>::type::value_type
folly::fibers::await::prepareForSelection(facebook::servicerouter::DispatchContext&)::{lambda(folly::fibers::Promise::prepareForSelection(facebook::servicerouter::DispatchContext&)::SelectionResult>)#1}>(folly::fibers::FirstArgOf&&)::{lambda()#1},
void>::type::value_type)::{lambda(folly::fibers::Fiber&)#1}*>() const+25>
"\311\303UH\211\345H\211}\370H\213E\370]ÐUH\211\345H\203\354\020H\211}\370H\213E\370H\211\307\350~\264\312\377\220\311\303UH\211\345SH\203\354\030H\211}\350H\211u\340H\213E\340H\211\307\350\236\377\377\377H\213\030H\213E\350H\211\307\350O\264\312\377H\211ƿ\b",
size_ = 0,
capacity_ = 1441151880758558720

Jeffrey



On Sat, Mar 26, 2016 at 8:22 PM, David Blaikie  wrote:

> If you're going to use clang built binaries with lldb, you'll want to pass
> -fstandalone-debug - this is the default on platforms where lldb is the
> primary debugger (Darwin and freebsd)
>
> Not sure if that is the problem you are seeing, but will be a problem
> sooner or later
> On Mar 26, 2016 4:16 PM, "Jeffrey Tan via llvm-dev" <
> llvm-...@lists.llvm.org> wrote:
>
>> Hi,
>>
>> While dogfooding our lldb-based IDE on Linux, I am seeing a lot of
>> variable evaluation errors related to DW_TAG_member which prevent us from
>> releasing the IDE. Can anyone confirm whether these are known issues? If
>> not, what information do you need to troubleshoot?
>>
>> Here is one example:
>>
>> (lldb) fr v
>> *error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member
>> '_M_pod_data' refers to type 0x10bb1e99 which extends beyond the bounds of
>> 0x10b9a901*
>> *error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_'
>> refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3*
>> *error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size'
>> refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae*
>> (facebook::biggrep::BigGrepMasterAsync *) this = 0x7fd14d374fd0
>> (const string &const) corpus = error: summary string parsing error: {
>>   store_ = {
>>  = {
>>   small_ = {}
>>   *ml_ = (data_ =
>> "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
>> size_ = 0, capacity_ = 1441151880758558720)*
>> }
>>   }
>> }
>> *(const string &const) needle = error: summary string parsing error: {*
>>   store_ = {
>>  = {
>>   small_ = {}
>>   ml_ = (data_ = "", size_ = 0, capacity_ = 1080863910568919040)
>> }
>>   }
>> }
>> (facebook::biggrep::Options &) options = 0x7fd133cfb7b0: {
>>   engine = error: summary string parsing error
>>   full_lines = true
>>   user = error: summary string parsing error
>>   max_bytes = 500
>>   leading_context = 0
>>   trailing_context = 0
>>   case_sensitive = true
>>   client_hostname = error: summary st

Re: [lldb-dev] UnicodeDecodeError for serialize SBValue description

2016-03-26 Thread Jeffrey Tan via lldb-dev
Btw: after patching with Siva's fix http://reviews.llvm.org/D18008, the
first field 'small_' is fixed; however, the second field 'ml_' still emits
garbage:

(lldb) fr v corpus
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
 = {
  small_ = "www"
  ml_ = (data_ =
"��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
size_ = 0, capacity_ = 1441151880758558720)
}
  }
}

Thanks for any info regarding how to encode this string.
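
If the bytes are genuinely binary (here they look like code bytes read past
the string buffer), another sketch — field name made up, not from the
thread — is to base64-encode them so the RPC payload round-trips exactly:

```python
import base64
import json

def encode_opaque(raw):
    # base64 maps arbitrary bytes to plain ASCII, so the exact contents
    # survive json.dumps and can be decoded again on the IDE side.
    return base64.b64encode(raw).decode('ascii')

raw = b'\xc9\xc3UH\x89\xe5H\x89}'      # byte pattern like the dump above
payload = json.dumps({'description_b64': encode_opaque(raw)})
```

The IDE would then call `base64.b64decode` on the field and decide how to
display the bytes.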

Jeffrey

On Sat, Mar 26, 2016 at 3:34 PM, Jeffrey Tan 
wrote:

> Follow-up for the previous question:
>
> Our Python code is trying to call json.dumps to serialize the variable
> evaluation result into a string block and send it to the IDE via RPC;
> however, it fails with "UnicodeDecodeError: 'utf8' codec can't decode byte
> 0xc9 in position 10: invalid continuation byte" because SBValue.description
> seems to return a non-UTF-8 string:
>
> (lldb) fr v
> *error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member
> '_M_pod_data' refers to type 0x10bb1e99 which extends beyond the bounds of
> 0x10b9a901*
> *error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_'
> refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3*
> *error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size'
> refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae*
> (facebook::biggrep::BigGrepMasterAsync *) this = 0x7fd14d374fd0
> (const string &const) corpus = error: summary string parsing error: {
>   store_ = {
>  = {
>   small_ = {}
>   *ml_ = (data_ =
> "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
> size_ = 0, capacity_ = 1441151880758558720)*
> }
>   }
> }
>
>
> File
> "/data/users/jeffreytan/fbsource/fbobjc/Tools/Nuclide/pkg/nuclide-debugger-lldb-server/scripts/chromedebugger.py",
> line 91, in received_message
> *response_in_json = json.dumps(response);*
>   File "/usr/lib64/python2.6/json/__init__.py", line 230, in dumps
> return _default_encoder.encode(obj)
>   File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
> chunks = list(self.iterencode(o))
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
> for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
> for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
> for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
> for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode
> for chunk in self._iterencode_list(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 204, in
> _iterencode_list
> for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
> for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
> for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
> for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
> for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 294, in _iterencode
> yield encoder(o)
> *UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10:
> invalid continuation byte*
>
> Question:
> Is the non-UTF-8 string expected, or is it just garbage data caused by the
> DW_TAG_member error? What is the proper way to find out the string encoding
> and serialize it using *json.dumps()*?
>
> Jeffrey
>