yelninei--- via Bug reports for the GNU Hurd, on Wed. 03 Sept. 2025 11:19:39 +0200, wrote:
> Sep 1, 2025, 19:10 by [email protected]:
> > yelninei--- via Bug reports for the GNU Hurd, on Mon. 01 Sept. 2025
> > 14:36:32 +0200, wrote:
> >
> >> Trying to use streamio with O_NONBLOCK and select results in select
> >> claiming that the fd is readable but when trying to read returning
> >> EWOULDBLOCK.
> >>
> >>
> >> Removing the O_NONBLOCK check in io_select_common makes select behave (I
> >> tried with timeout of NULL, 0 and 10s)
> >>
> >
> > This O_NONBLOCK check indeed looks wrong: select() is not supposed to be
> > affected by O_NONBLOCK, and just block. Does your system work fine with
> > that change? We should probably land it, patch welcome!
> >
>
> I have only tested the read part. When data is available (after the first
> failing read) or with a or no timeout it works as I would expect.
>
> Removing the check changes it from claiming that (with NONBLOCK) it is
> readable immediately to never. I guess this is more accurate as reading
> indeed WOULDBLOCK but not really ideal unless the empty
> input_buffer/output_buffer case is handled.

We'll want to fix everything anyway.

> >> however the input_buffer stays empty forever so it is never woken up. Any
> >> ideas?
> >>
> > Does device_read_reply_inband get called with errorcode == D_WOULD_BLOCK?
> > I wonder if in that case we'd be supposed to call start_input again, to
> > make another device_read request. If that second request gets
> > D_WOULD_BLOCK immediately, there is something wrong in the kernel driver
> > itself that doesn't manage to let us block.

It's very hard to follow because the references you use can be ambiguous.

> I have never questioned the EWOULDBLOCK on the first read,

What do you mean by "questioned"? Which "first read" do you mean? Is
your program calling read() before calling select()?

> this was also happening before the previous change already and looks expected
> (although I admit it is a bit weird)?

If there is nothing to read, it's indeed normal to get EWOULDBLOCK on
an O_NONBLOCK file descriptor.

> After a start_input with empty input_buffer and nowait dev_read returns
> EWOULDBLOCK as no messages are (immediately) available.

Does the device_read_request_inband() call inside start_input() return
D_WOULD_BLOCK? Does device_read_reply_inband get called with errorcode ==
D_WOULD_BLOCK?

> After that it is readable

After what? From what I read above dev_read was returning EWOULDBLOCK.

> until the "end"

By "end", do you mean the current end of file?

> where it now also EWOULDBLOCKs as opposed to D_WOULD_BLOCK

What do you mean by "as opposed"? Normally userland should only ever see
EWOULDBLOCK, and we would convert D_WOULD_BLOCK into EWOULDBLOCK. Where
do you get D_WOULD_BLOCK vs EWOULDBLOCK exactly?

> (unless one exactly manages to empty the input_buffer, i.e. by reading in
> chunks of 1).

Do you mean that if one calls io_read() with more than 1, we might get
stuck in a situation where we don't manage to empty the pending
characters?

> If there was a D_WOULD_BLOCK

Where do you mean there could be a D_WOULD_BLOCK?

> it would have failed with D_WOULD_BLOCK much earlier than the "end".

AIUI we wouldn't want to get any D_WOULD_BLOCK before the actual end of
file?

> I tried to add a start_input/start_output to request data if none is available

Where did you add it exactly?

> but it did not work and select (with NULL timeout) was still waiting.

Waiting, while there is still data to read, you mean?

> Should this be before or after the pthread_hurd_cond_timedwait_np?

By:

> > device_read_reply_inband get called with errorcode == D_WOULD_BLOCK?
> > I wonder if in that case we'd be supposed to call start_input again,

I didn't mean to add calling start_input() in io_select_common, but in
device_read_reply_inband(), in the case where errorcode ==
D_WOULD_BLOCK. The reason is that we are consuming the opportunity of
reading data from the device, so we'd have to trigger another one by
calling start_input() to make another device_read RPC call.

> Also do we need to acquire the mutex_lock on every iteration of the loop?

We have to keep the mutex acquired whenever we act on the global data
such as the input_buffer, etc.

Samuel

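
For reference, the behaviour discussed above (select() must report
readiness truthfully whether or not O_NONBLOCK is set, and read() must
agree with it) could be checked with something like the sketch below.
It is only an illustration: it uses a plain pipe rather than a streamio
node, and the names readable_now() and check_select_nonblock() are made
up for this example. Only standard POSIX calls are used.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Poll fd for readability with a zero timeout, so we never block.
   Returns 1 if select() reports fd readable, 0 if not, -1 on error. */
static int readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return n > 0 && FD_ISSET(fd, &rfds);
}

/* Returns 0 if select() and read() agree on an O_NONBLOCK pipe,
   otherwise the number of the first step that failed. */
int check_select_nonblock(void)
{
    int p[2];
    int flags;
    int step;
    char c = 0;
    ssize_t r;

    if (pipe(p) < 0)
        return 1;

    step = 2;   /* make the read end non-blocking */
    flags = fcntl(p[0], F_GETFL);
    if (flags < 0 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) < 0)
        goto out;

    step = 3;   /* empty pipe: select() must not claim readability */
    if (readable_now(p[0]) != 0)
        goto out;

    step = 4;   /* ... and read() must fail with EAGAIN/EWOULDBLOCK */
    r = read(p[0], &c, 1);
    if (!(r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)))
        goto out;

    step = 5;   /* make one byte available */
    if (write(p[1], "x", 1) != 1)
        goto out;

    step = 6;   /* now select() must report the fd readable */
    if (readable_now(p[0]) != 1)
        goto out;

    step = 7;   /* ... and read() must return that byte */
    if (read(p[0], &c, 1) != 1 || c != 'x')
        goto out;

    step = 0;   /* all checks passed */
out:
    close(p[0]);
    close(p[1]);
    return step;
}
```

Calling check_select_nonblock() from a small main() and printing the
returned step number should be enough to see where select() and read()
disagree. To exercise the streamio case, the pipe would have to be
replaced by an open() of the device node in question, which this sketch
does not attempt.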