Hi Adrian,

All the tricks, workarounds, and paradigms suggested or implemented by us, the kq 
users, are greatly simplified by adding the thing Paul is suggesting. What you 
are proposing here is to do not-so-natural things to work around a real problem 
which can be solved very easily and non-intrusively at a lower level. Seriously, 
if you truly believe that you can put an equals sign between the complexity of 
the user-space code and the proposed patch on the kqueue kernel side, then I 
will simply shut up.

Besides, one of the important points in the kq philosophy is simplifying things. I 
underline the "one of". It is not the goal, of course. Complex things are 
complex things no matter how hard you try to simplify them. But this definitely 
does not (and should not) fall into that category.

------
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 6:47 PM, Adrian Chadd <[email protected]> wrote:

> ... holy crap.
> 
> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie <[email protected]> wrote:
>> Hi,
>> 
>> Well, Paul already asked this question like 3-4 times now. Even insisting on 
>> it. I will also ask it again:
>> If user code is responsible of tracking down the data associated with the 
>> signalled entity, what is the point of having user data?
>> Is rendered completely useless…
> 
> .. why does everything have to have a well defined purpose that is
> also suited for use in _all_ situations?
That is called perfection. I know we can't achieve it, but I like to walk in 
that direction, at least.

> 
>> Not to mention, that your suggestion with FD index is a definite no-go. The 
>> FD values are re-used. Especially in MT environments. Imagine one kqueue 
>> call taking place in thread A and another one in thread B. Both threads 
>> waiting for events.
> 
> .. so don't do that. I mean, you're already having to write your code
> to _not_ touch FDs in other threads. I've done this before, it isn't
> that hard and it doesn't hurt performance.
Why not? This is how you achieve natural load balancing for multiple kevent() 
calls from multiple threads over the same kq fd. Otherwise, again, you have to 
write complex code to manually balance the threads. That brings in locking again…
Why do people always think that locking is cheap? Excessive locking hurts. A lot!

> 
>> When A does his magic, because of internal business rules, it decides to 
>> close FD number 123. It closes it and it connects somewhere else by opening 
>> a new one. Surprise, we MAY  get the value 123 again as a new socket, we put 
>> it on our index, etc. Now, thread B comes in and it has stale/old events for 
>> the old 123 FD. Somethings bad like EOF for the OLD version of FD number 123 
>> (the one we just closed anyway). Guess what… thread B will deallocate the 
>> perfectly good thingy inside the index associated with 123.
> 
> So you just ensure that nothing at all calls a close(123); but calls
> fd_close(123) which will in turn close(123) and free all the state
> associated with it.
Once threads A and B have returned from their kevent() calls, all bets are off. In 
between, you get the behaviour I just described, with threads A and B racing 
towards FD 123 to either close it or create a new one. How is wrapping close() 
going to help? It is not like you have any control over what the socket() function 
is going to return. (That gave me another token idea, btw… I will explain in 
another email; perhaps you would care to comment.)
Mathematically speaking, the fd-to-data association is not bijective.


> 
> You have fd_close() either grab a lock, or you ensure that only the
> owning thread can call fd_close(123) and if any other thread calls it,
> the behaviour is undefined.
As I said, that adds to the user-space code complexity. Just don't forget 
that Paul's suggestion solves all these problems in a ridiculously simple 
manner. All our ideas about keeping track of who owns whom, and all the indexes, 
can be put to rest. kq will notify us when the udata is out of scope from kq's 
perspective. That is all we ask.
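For reference, here is a minimal sketch of the fd_close() wrapper Adrian is describing (conn_table, MAX_FDS and the lock are hypothetical names of mine, not from any real API). It shows exactly the extra user-space machinery under discussion, and, as argued above, it does not help once two threads have already returned from kevent() holding events for a reused number:

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_FDS 1024                 /* hypothetical table size */

void *conn_table[MAX_FDS];           /* fd -> per-connection state */
pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Retire a descriptor's state and close it under one lock, so no
 * thread can observe fd N reused while the old state is still in
 * the table. Every close() in the program must go through here. */
void fd_close(int fd)
{
    pthread_mutex_lock(&table_lock);
    free(conn_table[fd]);
    conn_table[fd] = NULL;
    close(fd);
    pthread_mutex_unlock(&table_lock);
}
```

Note the cost: every connection teardown, in every thread, now serialises on table_lock, which is precisely the "excessive locking" objection above.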

> 
>> And regarding the "thread happiness", that is not happiness at all IMHO…
> 
> Unless you're writing a high connection throughput web server, the
> overhead of grabbing a lock in userland during the fd shutdown process
> is trivial. Yes, I've written those. It doesn't hurt you that much.
That "that much" is subjective. And a streaming server is a few orders of 
magnitude more complex than a web server. Remember, a web server is bound to the 
request/response paradigm, while a streaming server is a full-duplex (not 
request/response based) animal for most of its connections. I strongly believe it 
becomes a real problem there. (I would love to be wrong on this one!)

> 
> I'm confused as to why this is still an issue. Sure, fix the kqueue
> semantics and do it in a way that doesn't break backwards
> compatibility.
Then, if someone has the time and the inclination, it would be nice to have it. It 
is a neat solution. It is one thing to say, "hey, we don't have time, do it 
yourself", and another thing to offer "better" solutions by defending such an 
obvious caveat.

> But please don't claim that it's stopping you from
> getting real work done.
I didn't and I won't. I promise!

> I've written network apps with kqueue that
> scales to 8+ cores and (back in mid-2000's) gigabit + of small HTTP
> transactions.
Good for you. How is this relevant to our discussion of simplifying things? Of 
course it is possible. But let's make things simpler and more efficient. It really 
pays off in the long run. Hell, this is how kq was born in the first place: 
getting rid of all the garbage that one was supposed to do by hand to achieve what 
kq does with a few lines of code. Let's make that even better than it currently is.

> This stuff isn't at all problematic.
> 
> 
> Adrian
