On 10/02/2026 09:17, David Marchand wrote:
> Hello Kevin,
> 
> On Fri, 6 Feb 2026 at 18:21, Kevin Traynor <[email protected]> wrote:
>>
>> Add handling for epoll error and disconnect conditions EPOLLERR,
>> EPOLLHUP and EPOLLRDHUP.
>>
>> These events indicate that the interrupt file descriptor is in
>> an error state or there has been a hangup.
>>
>> Only do this for interrupts that are read in eal. Interrupts that
>> are read outside eal should deal with different interrupt scenarios
>> appropriate to their functionality. e.g. virtio interrupt handling
>> has reconnect mechanisms for some cases.
>>
>> Also, treat no bytes read as an error condition.
>>
>> Bugzilla ID: 1873
>> Fixes: af75078fece3 ("first public release")
>> Cc: [email protected]
> 
> Cc: Harman.
> 
>>
>> Signed-off-by: Kevin Traynor <[email protected]>
>> ---
>>  lib/eal/linux/eal_interrupts.c | 67 ++++++++++++++++++++++------------
>>  1 file changed, 44 insertions(+), 23 deletions(-)
>>
>> diff --git a/lib/eal/linux/eal_interrupts.c b/lib/eal/linux/eal_interrupts.c
>> index 9db978923a..68ca0f929e 100644
>> --- a/lib/eal/linux/eal_interrupts.c
>> +++ b/lib/eal/linux/eal_interrupts.c
>> @@ -887,4 +887,26 @@ rte_intr_disable(const struct rte_intr_handle 
>> *intr_handle)
>>  }
>>
>> +static void
>> +eal_intr_source_remove_and_free(struct rte_intr_source *src)
>> +{
>> +       struct rte_intr_callback *cb, *next;
>> +
>> +       /* Remove the interrupt source */
>> +       rte_spinlock_lock(&intr_lock);
>> +       TAILQ_REMOVE(&intr_sources, src, next);
>> +       rte_spinlock_unlock(&intr_lock);
>> +
>> +       /* Free callbacks */
>> +       for (cb = TAILQ_FIRST(&src->callbacks); cb; cb = next) {
>> +               next = TAILQ_NEXT(cb, next);
>> +               TAILQ_REMOVE(&src->callbacks, cb, next);
>> +               free(cb);
>> +       }
>> +
>> +       /* Free the interrupt source */
>> +       rte_intr_instance_free(src->intr_handle);
>> +       free(src);
>> +}
>> +
>>  static int
>>  eal_intr_process_interrupts(struct epoll_event *events, int nfds)
>> @@ -952,4 +974,16 @@ eal_intr_process_interrupts(struct epoll_event *events, 
>> int nfds)
>>
>>                 if (bytes_read > 0) {
>> +                       /**
>> +                        * Check for epoll error or disconnect events for
>> +                        * interrupts that are read directly in eal.
>> +                        */
>> +                       if (events[n].events & (EPOLLERR | EPOLLHUP | 
>> EPOLLRDHUP)) {
>> +                               EAL_LOG(INFO, "Disconnect condition on fd %d 
>> "
> 
> This is an anormal situation, I would make this log level the same as
> other logs below.
> 
> The fact that the interrupt fd gets into this state should be
> something to report and investigate.
> 

ok. I'll change to warning.

> 
>> +                                       "(events=0x%x), removing from epoll",
>> +                                       events[n].data.fd, events[n].events);
>> +                               eal_intr_source_remove_and_free(src);
>> +                               return -1;
>> +                       }
>> +
>>                         /**
>>                          * read out to clear the ready-to-be-read flag
>> @@ -957,5 +991,7 @@ eal_intr_process_interrupts(struct epoll_event *events, 
>> int nfds)
>>                          */
>>                         bytes_read = read(events[n].data.fd, &buf, 
>> bytes_read);
>> -                       if (bytes_read < 0) {
>> +                       if (bytes_read > 0) {
>> +                               call = true;
>> +                       } else if (bytes_read < 0) {
>>                                 if (errno == EINTR || errno == EWOULDBLOCK)
>>                                         continue;
>> @@ -965,27 +1001,12 @@ eal_intr_process_interrupts(struct epoll_event 
>> *events, int nfds)
>>                                         events[n].data.fd,
>>                                         strerror(errno));
>> -                               /*
>> -                                * The device is unplugged or buggy, remove
>> -                                * it as an interrupt source and return to
>> -                                * force the wait list to be rebuilt.
>> -                                */
>> -                               rte_spinlock_lock(&intr_lock);
>> -                               TAILQ_REMOVE(&intr_sources, src, next);
>> -                               rte_spinlock_unlock(&intr_lock);
>> -
>> -                               for (cb = TAILQ_FIRST(&src->callbacks); cb;
>> -                                                       cb = next) {
>> -                                       next = TAILQ_NEXT(cb, next);
>> -                                       TAILQ_REMOVE(&src->callbacks, cb, 
>> next);
>> -                                       free(cb);
>> -                               }
>> -                               rte_intr_instance_free(src->intr_handle);
>> -                               free(src);
>> -                               return -1;
>> -                       } else if (bytes_read == 0)
>> -                               EAL_LOG(ERR, "Read nothing from file "
>> +                       } else { /* bytes == 0 */
> 
> "bytes_read == 0", or remove this comment as the code is quite compact
> and leaves little space for wondering what this else block is about.
> 

Ack. I will take it as a compliment and remove the comment ;-)

> 
>> +                               EAL_LOG(WARNING, "Read nothing from file "
> 
> I would keep this log at the same level than the < 0 condition.
> It seems the same type of error.
> 
>>                                         "descriptor %d", events[n].data.fd);
> 
> And avoid splitting the format string.
> 

Ack.

> 
>> -                       else
>> -                               call = true;
>> +                       }
>> +                       if (bytes_read <= 0) {
>> +                               eal_intr_source_remove_and_free(src);
>> +                               return -1;
>> +                       }
>>                 }
>>
>> --
>> 2.52.0
>>
> 
> Except those nits, the fix looks correct.
> 
> Acked-by: David Marchand <[email protected]>
> 


Thanks David. I will make these changes in v3.

> 
> 
> 

Reply via email to