Yeah I saw the TCPSocket::Accept differences on top of tree after sending the email - unfortunately I have to do my work on this old branch, and I can't build our top of tree sources for the iphone target right now. It's not a great situation!
The min/max ports are needed for running the testsuite on a device, we run a program to relay a specific list of ports between the mac and the phone so we have to depend on that. I'll see if I can get the testsuite running on my local mac via lldb-server platform on top of tree and see what changes are needed there, hopefully I can repo the same problems I'm hitting on this old branch if they're still applicable. > On Feb 5, 2018, at 2:56 AM, Pavel Labath <lab...@google.com> wrote: > > Hi Jason, > > Thanks for the heads up. I look forward to getting rid of fork() > there, but there is one thing that's not clear to me. You say that > TCPSocket::Accept() calls CloseListenSockets().. I see don't see > anything like that in the current code, and I know for a fact that we > are able to handle multiple (concurrent) connections, as that is how > we run tests on android right now. Could this be some problem with > the branch that you were working on? > > In general, if you don't need to use the --min-port/--max-port > arguments (that is indeed broken in the fork mode), then everything > should already be set-up for the parallel remote testing. In fact, > unless you have a hard requirement to use --min/max-port then I would > advise against it. I think it can cause you a lot of problems down the > line. The issue with vending out ports like this is that the > "PortCoordinator" does not have control over the whole system, and it > cannot prevent some other process from claiming one of the ports that > it "owns". So, you may need to implement some retry logic to handle > the case when debugserver cannot bind to the port that it was assigned > to. > > I think that a much more reliable and simpler solution would be to > have debugserver ask the OS to assign it an unused port (by binding to > port zero), and then report that port back to lldb-platform (via the > --named-pipe arg -- this is how we use llgs right now). Besides fixing > the port allocation problem, this will also ensure that by the time > the platform reports the port back to the client, debugserver is > already primed and waiting for a connection. I'm not sure if > debugserver right now supports the --named-pipe argument, but it > should be fairly trivial to add it if needed. > > Of course, if you need to the min/max-port mode of operation then > that's fine, but if this were up to me, i'd try hard to find a way to > avoid it. :) > > regards, > pavel > > On 3 February 2018 at 04:12, Jason Molenda <jmole...@apple.com> wrote: >> Hi Pavel, I closed this phabracator, I've been working on this on-and-off >> and I think a different approach would be better. >> >> I am working up a patch where I change tools/lldb-server/lldb-platform.cpp >> from using a fork() model to using std::thread's. The main thread listens >> for incoming connections, and when one is received, it starts up a new >> std::thread to handle that connection. I have a PortCoordinator singleton >> object that manages the ports we can use for communications. Newly created >> threads request ports from (& free them when the thread is finished) so that >> it would be possible to run multiple tests at the same time. >> >> The current code fork()'s when it receives a connection, and gives the child >> process a copy of all the ports that are available. Because it's in a >> separate process, if we were to receive a second connection and fork off a >> new process, it would have the same list of ports listed as available. When >> debugserver is being used, this is a problem - when the lldb-server platform >> thread/process gets a qLaunchGDBServer packet, we run debugserver saying >> "hey debugserver listen for a connection on port N" and then tell the remote >> lldb process "there's a debugserver waiting to hear from you on port N" - so >> lldb-server can't test whether port N is in use or not. >> >> (there was also a problem in GDBRemoteCommunication::StartDebugserverProcess >> where it has a url like localhost:1111 and then it tries to run debugserver >> in --named-pipe mode even though we already have a port # to use in the url.) >> >> The final problem is that TCPSocket::Accept waits for an incoming connection >> on the assigned port #, and when it comes in it gets a new file descriptor >> for that connection. It should resume listening to that assigned port for >> the next connection that comes in ... but today TCPSocket::Accept calls >> CloseListenSockets() so after opening the first lldb-server platform >> connection that comes in, when we go to accept() the second, the socket has >> been closed and we exit immediately. To quickly recap the POSIX API (I >> never do network programming so I have to remind myself of this every time), >> the workflow for a server like this usually looks like >> >> parentfd = socket(AF_INET, SOCK_STREAM, 0); >> optval = 1; >> setsockopt(parentfd, SOL_SOCKET, SO_REUSEADDR, >> (const void *)&optval , sizeof(int)); >> serveraddr.sin_family = AF_INET; >> serveraddr.sin_addr.s_addr = htonl(INADDR_ANY); >> serveraddr.sin_port = htons((unsigned short)portno); >> bind(parentfd, (struct sockaddr *) &serveraddr, sizeof(serveraddr); >> listen(parentfd, 100); // allow 100 connections to queue up -- whatever >> childfd = accept(parentfd, (struct sockaddr *) &clientaddr, &clientlen); >> >> and then we go back to accept()'ing on parentfd after we've spun off >> something to talk to childfd. >> >> >> I've been doing all of my work on an old branch for reasons that are boring, >> so some/all of this may be fixed on top of tree already! But this is what I >> had to do to get my branch to work correctly on a remote system. >> >> I also noticed that dotest.py won't run multiple debug sessions with a >> remote target. That was part of my goal, to speed these up, but it seems to >> run in --no-multiprocess mode currently. >> >> >> I'll be picking this up next week - my changes right now are a mess, and >> they're against this old branch that I have to work on, but I'll get them >> cleaned up & updated to top of tree and make up a patchset. But I wanted to >> give you a heads up on where I'm headed as this touches a lot of your code. >> >> Thanks! >> >> >> J >> >> >> >>> On Jan 18, 2018, at 3:44 AM, Pavel Labath via Phabricator >>> <revi...@reviews.llvm.org> wrote: >>> >>> labath added a comment. >>> >>> In https://reviews.llvm.org/D42210#979608, @jasonmolenda wrote: >>> >>>> Jim remembers some problem with logging across a fork()'ed process, Pavel >>>> does this ring a bell? This change might be completely bogus -- but it's >>>> very difficult to debug the child side of an lldb-server platform today. >>> >>> >>> I believe Jim is thinking of https://reviews.llvm.org/D38938. The issue is >>> that if another thread holds the logging mutex while we fork(), the mutex >>> will remain in a bogus state in the child process. This would mean that any >>> operation on the Log subsystem (such as enabling logging) could hang. We >>> hold the mutex for just a couple of instructions (just enough to copy a >>> shared_ptr), but after some code reshuffles, we were hitting this case very >>> often in liblldb. >>> >>> Now, I don't think this can happen here as at the point where we are >>> forking, the platform process is single-threaded. So, enabling logging here >>> should be safe, but only accidentally. It should be possible to make this >>> always safe, but that will need to be done with a very steady hand. My >>> opinion is we should not bother with that (i.e., just check this in as-is) >>> until we actually need more threads in the platform, particularly as I >>> think the long-term direction here should be to replace the fork with a new >>> thread for handling the connection. >>> >>> As for testing, the problem we have now is that we have no direct platform >>> tests today -- remote tests will test platform by the virtue of all >>> connections going through it but a standard check-lldb will not even run >>> this code. I have been planning to add tests like these, but I haven't got >>> around to that yet, and I'm not sure when will that happen. If you want to >>> take a stab at it, I can give you some pointers. >>> >>> BTW: for most debugging needs you should be able to just start the platform >>> without the --server argument, which will disable forking and handle each >>> connection in the main process. >>> >>> >>> Repository: >>> rL LLVM >>> >>> https://reviews.llvm.org/D42210 >>> >>> >>> >> _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits