Sorry, I know we have a version of this fix in proposed now, but based on upstream feedback, we should switch to the following patch:
From: K. Y. Srinivasan [mailto:k...@microsoft.com] Sent: Tuesday, April 5, 2016 11:50 AM To: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com; vkuzn...@redhat.com; jasow...@redhat.com Cc: KY Srinivasan <k...@microsoft.com>; sta...@vger.kernel.org Subject: [PATCH 1/1] Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read() On the consumer side, we have interrupt driven flow management of the producer. It is sufficient to base the signaling decision on the amount of space that is available to write after the read is complete. The current code samples the previous available space and uses this in making the signaling decision. This state can be stale and is unnecessary. Since the state can be stale, we end up not signaling the host (when we should) and this can result in a hang. Fix this problem by removing the unnecessary check. I would like to thank Arseney Romanenko <arsen...@microsoft.com> for pointing out this issue. Also, issue a full memory barrier before making the signaling descision to correctly deal with potential reordering of the write (read index) followed by the read of pending_sz. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Cc: <sta...@vger.kernel.org> --- drivers/hv/ring_buffer.c | 26 ++++++++++++++++++++------ 1 files changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 5613e2b..a40a73a 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -103,15 +103,29 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) * there is room for the producer to send the pending packet. */ -static bool hv_need_to_signal_on_read(u32 prev_write_sz, - struct hv_ring_buffer_info *rbi) +static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; u32 r_size; - u32 write_loc = rbi->ring_buffer->write_index; + u32 write_loc; u32 read_loc = rbi->ring_buffer->read_index; - u32 pending_sz = rbi->ring_buffer->pending_send_sz; + u32 pending_sz; + /* + * Issue a full memory barrier before making the signaling decision. + * Here is the reason for having this barrier: + * If the reading of the pend_sz (in this function) + * were to be reordered and read before we commit the new read + * index (in the calling function) we could + * have a problem. If the host were to set the pending_sz after we + * have sampled pending_sz and go to sleep before we commit the + * read index, we could miss sending the interrupt. Issue a full + * memory barrier to address this. + */ + mb(); + + pending_sz = rbi->ring_buffer->pending_send_sz; + write_loc = rbi->ring_buffer->write_index; /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; @@ -120,7 +134,7 @@ static bool hv_need_to_signal_on_read(u32 prev_write_sz, cur_write_sz = write_loc >= read_loc ? r_size - (write_loc - read_loc) : read_loc - write_loc; - if ((prev_write_sz < pending_sz) && (cur_write_sz >= pending_sz)) + if (cur_write_sz >= pending_sz) return true; return false; @@ -455,7 +469,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); - *signal = hv_need_to_signal_on_read(bytes_avail_towrite, inring_info); + *signal = hv_need_to_signal_on_read(inring_info); return ret; } -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1556264 Title: [Hyper-V] vmbus: Fix a bug in hv_need_to_signal_on_read() Status in linux package in Ubuntu: Fix Released Status in linux-lts-vivid package in Ubuntu: In Progress Status in linux source package in Trusty: Fix Committed Status in linux-lts-vivid source package in Trusty: Fix Committed Status in linux source package in Wily: Fix Committed Status in linux source package in Xenial: Fix Released Bug description: The following patch has been submitted upstream in response to investigation of customer issues with network connections hanging under high load. On the consumer side, we have interrupt driven flow management of the producer. It is sufficient to base the signalling decision on the amount of space that is available to write after the read is complete. The current code samples the previous available space and uses this in making the signalling decision. This state can be stale and is unnecessary. Since the state can be stale, we end up not signalling the host (when we should) and this can result in a hang. Fix this problem by removing the unnecessary check. I would like to thank Arseney Romanenko <arsen...@microsoft.com> for pointing out this bug. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Cc: <sta...@vger.kernel.org> --- drivers/hv/ring_buffer.c | 7 +++---- 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 5613e2b..085003a 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -103,8 +103,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) * there is room for the producer to send the pending packet. */ -static bool hv_need_to_signal_on_read(u32 prev_write_sz, - struct hv_ring_buffer_info *rbi) +static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; u32 r_size; @@ -120,7 +119,7 @@ static bool hv_need_to_signal_on_read(u32 prev_write_sz, cur_write_sz = write_loc >= read_loc ? r_size - (write_loc - read_loc) : read_loc - write_loc; - if ((prev_write_sz < pending_sz) && (cur_write_sz >= pending_sz)) + if (cur_write_sz >= pending_sz) return true; return false; @@ -455,7 +454,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); - *signal = hv_need_to_signal_on_read(bytes_avail_towrite, inring_info); + *signal = hv_need_to_signal_on_read(inring_info); return ret; } We have customers who are encountering this issue. Although this patch is not yet accepted upstream, we would like to get them a test kernel as soon as we can. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1556264/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp