Hello Eugene, this sounds troubling, and I'd like to get to the bottom of it. If you can get me a bit more information, I believe we can figure it out:
- could you get a backtrace of lldb-server when it is in the "stuck" state (just attach with lldb/gdb after it hangs and do "bt")? I want to see where it is spinning, as I don't see any obvious infinite loop there.
- are you able to still reproduce the bug with logging enabled? If so, I'd like to see the log file to understand this better. (You can enable logging by starting lldb-server with: --log-file XXX.log --log-channels "lldb all:linux all". If you're starting it via the lldb client, you can set the LLDB_DEBUGSERVER_LOG_FILE and LLDB_SERVER_LOG_CHANNELS environment variables to achieve this.)
- if you can get me reasonably detailed repro steps, I can try to investigate (I am fine with the first step being "install ubuntu 16.04 in virtualbox")

On 6 December 2016 at 23:41, Eugene Birukov via lldb-dev <lldb-dev@lists.llvm.org> wrote:
> Hi,
> 1. I believe that lldb-server spins inside ptrace. I put breakpoint on the
> highlighted line, and it does not hit. If I put breakpoint on line before,
> it hits but lldb-server hangs.

Do you mean actually inside the ptrace(2) syscall? Your description would certainly fit that, but that sounds scary, as it would mean a kernel bug. If that's the case, then we have to start looking in the kernel. I have some experience with that, but if we can boil this down to a simple use case, we can also ask the kernel ptrace folks for help.

> 2. It seems that hang is caused by the client trying to read response too
> fast. I mean, if I step through the client code it works - i.e. there is
> significant delay between client writing into pipe and issuing ::select to
> wait for response.

I am not sure how this fits in with the item above. I find it hard to believe that the presence of select(2) in one process would affect the outcome of ptrace() in another, unless we are actually encountering a kernel scheduler bug, which I find unlikely. Hopefully we can get more insight here with additional logging information.

> Any advice how to deal with the situation except putting random sleeps in
> random places?

Inserting sleeps in various places is a valid (albeit very slow) strategy for debugging races. If you keep pushing the sleep down, you will eventually reach the place where it becomes obvious what is racing (or, at least, which component is to blame). Hopefully we can do something smarter though.

If you are suspecting a kernel bug, I'd recommend recreating it in a simple standalone application (fork, trace the child, write its memory), as then it is easy to ask for help. Something along the lines of the sketch below could be a starting point.
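This is a rough, untested sketch of what I have in mind. The exact sequence (PTRACE_TRACEME plus SIGSTOP here, PTRACE_POKEDATA for the writes) is only an assumption; adjust it to match whatever lldb-server is actually doing when it hangs, according to your log:

/* repro.c - standalone ptrace repro sketch (untested); build with: cc repro.c -o repro */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile long scratch; /* the child inherits this address, so the parent
                                 can poke it in the child's address space */

int main(void) {
  pid_t child = fork();
  if (child == 0) {
    /* Child: ask to be traced and stop, so the parent finds us stopped. */
    ptrace(PTRACE_TRACEME, 0, NULL, NULL);
    raise(SIGSTOP);
    for (;;)
      pause();
  }

  /* Parent: wait for the initial stop, then repeatedly write the child's memory. */
  int status;
  waitpid(child, &status, 0);

  for (long i = 0; i < 1000000; ++i) {
    if (ptrace(PTRACE_POKEDATA, child, (void *)&scratch, (void *)i) == -1) {
      fprintf(stderr, "POKEDATA failed at iteration %ld: %s\n", i, strerror(errno));
      break;
    }
  }

  kill(child, SIGKILL);
  waitpid(child, &status, 0);
  return 0;
}

If something like that loop ever gets stuck inside the ptrace call, you have a self-contained test case that is easy to hand over to the kernel folks.

pl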