Re: Streamio and select

Samuel Thibault Wed, 03 Sep 2025 11:54:50 -0700

yelninei--- via Bug reports for the GNU Hurd, on Wed. 03 Sep. 2025 11:19:39 
+0200, wrote:
> Sep 1, 2025, 19:10 by [email protected]:
> > yelninei--- via Bug reports for the GNU Hurd, on Mon. 01 Sep. 2025 
> > 14:36:32 +0200, wrote:
> >
> >> Trying to use streamio with O_NONBLOCK and select results in select 
> >> claiming that the fd is readable but when trying to read returning 
> >> EWOULDBLOCK.
> >>
> >>
> >> Removing the O_NONBLOCK check in io_select_common makes select behave (I 
> >> tried with timeouts of NULL, 0 and 10s)
> >>
> >
> > This O_NONBLOCK check indeed looks wrong: select() is not supposed to be
> > affected by O_NONBLOCK, and just block. Does your system work fine with
> > that change? We should probably land it, patch welcome!
> >
> 
> I have only tested the read part. When data is available (after the first 
> failing read), with or without a timeout, it works as I would expect.
> 
> Removing the check changes select (with NONBLOCK) from claiming that the fd 
> is readable immediately to never claiming it. I guess this is more accurate, 
> as reading would indeed return EWOULDBLOCK, but it is not really ideal unless 
> the empty input_buffer/output_buffer case is handled.


We'll want to fix everything anyway.

> >> however the input_buffer stays empty forever so it is never woken up. Any 
> >> ideas?
> >>
> > Does device_read_reply_inband get called with errorcode == D_WOULD_BLOCK?
> > I wonder if in that case we'd be supposed to call start_input again, to
> > make another device_read request. If that second request gets
> > D_WOULD_BLOCK immediately, there is something wrong in the kernel driver
> > itself that doesn't manage to let us block.

It's very hard to follow because the references you use can be
ambiguous.

> I have never questioned the EWOULDBLOCK on the first read,

What do you mean by "questioned"? Which "first read" do you mean? Is
your program calling read() before calling select()?

> this was also happening before the previous change already and looks expected 
> (although I admit it is a bit weird)?

If there is nothing to read, it's indeed normal to get EWOULDBLOCK on an
O_NONBLOCK file descriptor.
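
To make the expected semantics concrete, here is a minimal sketch using an
ordinary pipe instead of streamio (an assumption, so that it can run on any
POSIX system). poll_readable and check_nonblock_semantics are hypothetical
helper names, not anything from the Hurd tree. The point is that select()
and read() have to agree: not readable and EWOULDBLOCK while empty, readable
and a successful read once a byte is queued.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Ask select() whether FD is readable right now (zero timeout).
   Returns 1 if readable, 0 if not, -1 on error.  */
int
poll_readable (int fd)
{
  fd_set rfds;
  struct timeval tv = { 0, 0 };

  FD_ZERO (&rfds);
  FD_SET (fd, &rfds);
  return select (fd + 1, &rfds, NULL, NULL, &tv);
}

/* Hypothetical check: returns 0 if select() and read() agree with each
   other on a non-blocking pipe, both while it is empty and once it
   holds one byte; returns -1 otherwise.  */
int
check_nonblock_semantics (void)
{
  int p[2];
  char c = 0;
  int flags, ok;

  if (pipe (p) < 0)
    return -1;

  /* Make only the read end non-blocking, preserving its other flags.  */
  flags = fcntl (p[0], F_GETFL);
  if (flags < 0 || fcntl (p[0], F_SETFL, flags | O_NONBLOCK) < 0)
    {
      close (p[0]);
      close (p[1]);
      return -1;
    }

  /* Empty pipe: select() must not report it readable, and read() must
     fail with EWOULDBLOCK (the same value as EAGAIN on many systems).  */
  ok = poll_readable (p[0]) == 0;
  ok = ok && read (p[0], &c, 1) == -1
       && (errno == EWOULDBLOCK || errno == EAGAIN);

  /* One byte queued: select() reports readable and read() succeeds.  */
  ok = ok && write (p[1], "x", 1) == 1;
  ok = ok && poll_readable (p[0]) == 1;
  ok = ok && read (p[0], &c, 1) == 1 && c == 'x';

  close (p[0]);
  close (p[1]);
  return ok ? 0 : -1;
}
```

Pointing the same two helpers at a streamio node instead of a pipe would
probably be a convenient way to show which of the two calls disagrees.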

> After a start_input with empty input_buffer and nowait dev_read returns 
> EWOULDBLOCK as no messages are (immediately) available.

Does the device_read_request_inband() call inside start_input() return
D_WOULD_BLOCK? Does device_read_reply_inband get called with errorcode
== D_WOULD_BLOCK?

> After that it is readable

After what? From what I read above dev_read was returning EWOULDBLOCK.

> until the "end"

By "end", do you mean the current end of file?

> where it now also EWOULDBLOCKs as opposed to D_WOULD_BLOCK

What do you mean by "as opposed"? Normally userland should only ever see
EWOULDBLOCK, and we would convert D_WOULD_BLOCK into EWOULDBLOCK. Where
do you get D_WOULD_BLOCK vs EWOULDBLOCK exactly?

> (unless one exactly manages to empty the input_buffer, i.e. by reading in 
> chunks of 1).

Do you mean that if one calls io_read() with an amount greater than 1, we
might get stuck in a situation where we don't manage to empty the pending
characters?

> If there was a D_WOULD_BLOCK

Where do you mean there could be a D_WOULD_BLOCK?

> it would have failed with D_WOULD_BLOCK much earlier than the "end".

AIUI we wouldn't want to get any D_WOULD_BLOCK before the actual end of
file?

> I tried to add a start_input/start_output to request data if none is available

Where did you add it exactly?

> but it did not work and select (with NULL timeout) was still waiting.

Waiting, while there is still data to read, you mean?

> Should this be before or after the pthread_hurd_cond_timedwait_np?

By:

> > device_read_reply_inband get called with errorcode == D_WOULD_BLOCK?
> > I wonder if in that case we'd be supposed to call start_input again,

I didn't mean to add calling start_input() in io_select_common,
but in device_read_reply_inband(), in the case where errorcode ==
D_WOULD_BLOCK. That reply consumes the opportunity of reading data
from the device, so we'd have to trigger another one by calling
start_input() to make another device_read RPC call.
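
A rough sketch of that idea, as a standalone simulation rather than actual
streamio code. Everything here is a hypothetical stand-in: struct sim_dev,
the SIM_* codes and the sim_* functions only mimic the roles of
input_buffer, D_WOULD_BLOCK, start_input() and device_read_reply_inband(),
and none of it uses the real Mach device interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real Mach/streamio definitions.  */
enum { SIM_SUCCESS = 0, SIM_WOULD_BLOCK = 1 };

struct sim_dev
{
  int blocks_left;   /* replies that will still say "would block" */
  int requests;      /* read requests issued so far */
  int pending;       /* nonzero while a request is outstanding */
  char buf[16];      /* plays the role of input_buffer */
  size_t len;        /* bytes currently in buf */
};

/* Plays the role of start_input(): issue one read request unless one
   is already outstanding.  */
void
sim_start_input (struct sim_dev *d)
{
  if (d->pending)
    return;
  d->pending = 1;
  d->requests++;
}

/* Plays the role of device_read_reply_inband().  */
void
sim_read_reply (struct sim_dev *d, int errorcode,
                const char *data, size_t n)
{
  d->pending = 0;               /* the outstanding request is consumed */
  if (errorcode == SIM_WOULD_BLOCK)
    {
      sim_start_input (d);      /* re-arm: ask the device again */
      return;
    }
  if (n > sizeof d->buf - d->len)
    n = sizeof d->buf - d->len;
  memcpy (d->buf + d->len, data, n);
  d->len += n;
}

/* Simulated kernel side: answer the outstanding request, if any.
   Returns 1 if a reply was delivered, 0 if nothing was pending.  */
int
sim_deliver (struct sim_dev *d)
{
  if (!d->pending)
    return 0;
  if (d->blocks_left > 0)
    {
      d->blocks_left--;
      sim_read_reply (d, SIM_WOULD_BLOCK, NULL, 0);
    }
  else
    sim_read_reply (d, SIM_SUCCESS, "x", 1);
  return 1;
}

/* Run until data arrives; returns how many requests were needed.  */
int
sim_requests_until_data (int would_block_replies)
{
  struct sim_dev d = { 0 };

  d.blocks_left = would_block_replies;
  sim_start_input (&d);
  while (d.len == 0 && sim_deliver (&d))
    ;
  return d.requests;
}
```

Without the sim_start_input() call in the would-block branch, the first
such reply would leave no request outstanding and the buffer would stay
empty forever. If the real driver answered every re-armed request with
D_WOULD_BLOCK immediately, this would spin, which is the kernel driver
problem mentioned earlier.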

> Also do we need to acquire the mutex_lock on every iteration of the loop?

We have to keep the mutex acquired whenever we act on the global data
such as the input_buffer, etc.
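
For illustration, the usual shape of such a wait loop, sketched with
portable pthread_cond_wait() rather than pthread_hurd_cond_timedwait_np()
(an assumption made to keep it runnable outside the Hurd). struct
shared_buf and the function names are hypothetical. The mutex is taken once
before the loop; the condition wait releases it while sleeping and
reacquires it before returning, so the shared state is only ever examined
with the lock held.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state, standing in for input_buffer and friends.  */
struct shared_buf
{
  pthread_mutex_t lock;
  pthread_cond_t wakeup;
  size_t avail;                 /* bytes available, protected by lock */
};

/* Block until some data is available, then take all of it.  */
size_t
wait_for_data (struct shared_buf *b)
{
  size_t n;

  pthread_mutex_lock (&b->lock);
  while (b->avail == 0)
    /* Releases the lock while sleeping, reacquires it before returning,
       so the loop condition is always re-checked under the lock.  */
    pthread_cond_wait (&b->wakeup, &b->lock);
  n = b->avail;
  b->avail = 0;
  pthread_mutex_unlock (&b->lock);
  return n;
}

/* Producer side: update the shared state and wake waiters, also with
   the lock held.  */
void
publish_data (struct shared_buf *b, size_t n)
{
  pthread_mutex_lock (&b->lock);
  b->avail += n;
  pthread_cond_broadcast (&b->wakeup);
  pthread_mutex_unlock (&b->lock);
}

/* Single-threaded demonstration: publish first, so the wait returns
   without sleeping.  */
size_t
demo_publish_then_wait (void)
{
  struct shared_buf b
    = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };

  publish_data (&b, 3);
  return wait_for_data (&b);
}
```

In a real translator, publish_data() would be called from another thread,
for instance the one handling the device reply.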

Samuel
