------
Eugen-Andrei Gavriloaie
Web: http://www.rtmpd.com

On May 13, 2013, at 9:02 PM, Adrian Chadd <[email protected]> wrote:

> Hi,
> 
> The reason I tend to suggest this is for portability and debugging
> reasons. (Before and even since libevent came into existence.)
> 
> If you do it right, you can stub / inline out all of the wrapper
> functions in userland and translate them to straight system or library
> calls.
> 
> Anyway. I'm all for making kqueue better. I just worry that adding
> little hacks here and there isn't the right way to do it. If you want
> to guarantee specific behaviours with kqueue, you should likely define
> how it should work in its entirety and see if it will cause
> architectural difficulties down the track.
And it has caused some so far. We have workarounds for them, no problem.

> Until that is done, I think
> you have no excuse to get your code working as needed.
Yes, I agree. But when I look at the user-space code without that feature, and 
think about how it would have been with that feature, it kinda makes me cry. 
A little more pain and I will write that patch myself. I'm just hoping that the 
kq kernel-side code will be handled by more capable hands than mine first. 
Ideally, by its creators.

> 
> Don't blame kqueue because what (iirc) is not defined behaviour isn't
> defined in a way that makes you happy :)
Nobody blamed kqueue. I'm just saying that it would be better for me (and I'm 
not the only one) to get a liiiitle more help from it. It was born from needs 
and it evolved because of needs, so why stop now? I dare say it will become a 
standard on Linux and other OSes very soon. It is the best fd reactor. Hands 
down!

Best regards,
Andrei

> 
> 
> 
> Adrian
> 
> On 13 May 2013 09:36, Eugen-Andrei Gavriloaie <[email protected]> wrote:
>> Hi Adrian,
>> 
>> All the tricks, workarounds, and paradigms suggested or implemented by us, 
>> the kq users, would be greatly simplified by adding the thing that Paul is 
>> suggesting. What you are saying here is basically to do not-so-natural 
>> things to overcome a real problem which could be solved very easily and 
>> non-intrusively at a lower level. Seriously, if you truly believe that you 
>> can put an equals sign between the complexity of the user-space code and the 
>> wanted patch on the kqueue kernel side, then I will simply shut up.
>> 
>> Besides, one of the important points of the kq philosophy is simplifying 
>> things. I underline the "one of". It is not the goal, of course. Complex 
>> things are complex things no matter how hard you try to simplify them. But 
>> this definitely does not (or should not) fall into that category.
>> 
>> ------
>> Eugen-Andrei Gavriloaie
>> Web: http://www.rtmpd.com
>> 
>> On May 13, 2013, at 6:47 PM, Adrian Chadd <[email protected]> wrote:
>> 
>>> ... holy crap.
>>> 
>>> On 13 May 2013 08:37, Eugen-Andrei Gavriloaie <[email protected]> wrote:
>>>> Hi,
>>>> 
>>>> Well, Paul has already asked this question 3-4 times now, even insisting
>>>> on it. I will ask it again:
>>>> if user code is responsible for tracking down the data associated with the
>>>> signalled entity, what is the point of having user data?
>>>> It is rendered completely useless…
>>> 
>>> .. why does everything have to have a well defined purpose that is
>>> also suited for use in _all_ situations?
>> That is called perfection. I know we can't achieve it, but I like to walk in 
>> that direction at least.
>> 
>>> 
>>>> Not to mention that your suggestion of an FD index is a definite no-go:
>>>> FD values are reused, especially in MT environments. Imagine one kevent()
>>>> call taking place in thread A and another one in thread B, both threads
>>>> waiting for events.
>>> 
>>> .. so don't do that. I mean, you're already having to write your code
>>> to _not_ touch FDs in other threads. I've done this before, it isn't
>>> that hard and it doesn't hurt performance.
>> Why not? This is how you get natural load balancing for multiple kevent() 
>> calls from multiple threads over the same kq fd. Otherwise, again, you have 
>> to write complex code to manually balance the threads, and that brings 
>> locking back in…
>> Why do people always think that locking is cheap? Excessive locking hurts. A 
>> lot!
>> 
>>> 
>>>> When A does its magic, because of internal business rules, it decides to
>>>> close FD number 123. It closes it and connects somewhere else by opening
>>>> a new one. Surprise: we MAY get the value 123 again as a new socket, we
>>>> put it in our index, etc. Now thread B comes in, and it has stale events
>>>> for the old FD 123: something bad like an EOF for the OLD incarnation of
>>>> FD number 123 (the one we just closed anyway). Guess what… thread B will
>>>> deallocate the perfectly good state in the index associated with 123.
>>> 
>>> So you just ensure that nothing at all calls a close(123); but calls
>>> fd_close(123) which will in turn close(123) and free all the state
>>> associated with it.
>> Once threads A and B have returned from their kevent() calls, all bets are 
>> off. In between, you get the behaviour I just described, with threads A and 
>> B racing towards FD 123 to either close it or create a new one. How is 
>> wrapping close() going to help? It's not like you have any control over what 
>> the socket() function is going to return. (That gave me another token idea, 
>> btw… I will explain in another email; perhaps you care to comment.)
>> Mathematically speaking, the fd-to-data association is not bijective.
>> 
>> 
>>> 
>>> You have fd_close() either grab a lock, or you ensure that only the
>>> owning thread can call fd_close(123) and if any other thread calls it,
>>> the behaviour is undefined.
>> As I said, that adds to the user-space code complexity. Just don't forget 
>> that Paul's suggestion solves all these problems in a ridiculously simple 
>> manner. All our schemes for keeping track of who owns what, and all the 
>> indexes, could be put to rest: kq would notify us when the udata is out of 
>> scope from kq's perspective. That is all we ask.
>> 
>>> 
>>>> And regarding the "thread happiness", that is not happiness at all IMHO…
>>> 
>>> Unless you're writing a high connection throughput web server, the
>>> overhead of grabbing a lock in userland during the fd shutdown process
>>> is trivial. Yes, I've written those. It doesn't hurt you that much.
>> That "that much" is subjective. And a streaming server is a few orders of 
>> magnitude more complex than a web server. Remember, a web server is bound to 
>> the request/response paradigm, while a streaming server is a full-duplex 
>> (not request/response based) animal for most of its connections. I strongly 
>> believe that becomes a real problem. (I would love to be wrong on this one!)
>> 
>>> 
>>> I'm confused as to why this is still an issue. Sure, fix the kqueue
>>> semantics and do it in a way that doesn't break backwards
>>> compatibility.
>> Then, if someone has the time and the inclination, it would be nice to have 
>> it. It is a neat solution. It is one thing to say, hey, we don't have time, 
>> do it yourself, and another thing to try to offer "better" solutions by 
>> defending such an obvious caveat.
>> 
>>> But please don't claim that it's stopping you from
>>> getting real work done.
>> I didn't and I won't. I promise!
>> 
>>> I've written network apps with kqueue that
>>> scales to 8+ cores and (back in mid-2000's) gigabit + of small HTTP
>>> transactions.
>> Good for you. How is this relevant to our discussion of simplifying things? 
>> Of course it is possible. But let's make things simpler and more efficient; 
>> it really pays off in the long run. Hell, this is how kq was born in the 
>> first place: getting rid of all the garbage one used to have to write to 
>> achieve what kq now does with a few lines of code. Let's make that even 
>> better than it currently is.
>> 
>>> This stuff isn't at all problematic.
>>> 
>>> 
>>> Adrian
>> 
