On 7/2/19 3:12 PM, Gerd Rausch wrote:
On 02/07/2019 14.18, [email protected] wrote:
On 7/2/19 2:05 PM, Gerd Rausch wrote:
What do you call "RDS_GET_MR" semantics?

It's a blocking socket call, meaning that after this call returns to the
user, the key must be valid. With async registration that can't be
guaranteed.


If the "IB_WR_REG_MR" operation does not complete successfully within
the given (to-be-discussed?) timeout, "rds_ib_post_reg_frmr" will return
"-EBUSY".

And that should propagate up the entire stack and make its way into
"setsockopt" returning "-1" with "errno == EBUSY".

This is the easy case, and it doesn't need any waiting, since the call just
came back without posting a work request.

Do you see a problem with this approach?
Did you observe a situation where this did not work?

Calling rds_ib_post_reg_frmr() and looking at the return value doesn't
guarantee that the posted work request is going to be successful.

Are you saying that no timeout, no matter how large, is large enough?
If that's the case, we can consider turning the "wait_event_timeout"
into a "wait_event".

Yep. Basically, until the ceq handler reports successful completion of
reg_mr or inval_mr, the mr is not guaranteed to be registered or
invalidated.

Are you suggesting to
a) Not fix this bug right now and wait until some later point in time

When did I say that? I said: have you explored an alternate approach to
fix the issue, and if not, could you try it out?


Why explore an alternate approach?
Do you see a problem with the proposed patch (other than the choice of timeout)?

Yes, proceeding based on a timeout isn't safe. wait_event without a timeout
would make it guaranteed and give sync-like behavior. This is the
behavior with FMR reg and inval calls. If you change it to
wait_event, then I am fine with the change.

Regards,
Santosh
