spanglerco opened a new issue, #80:
URL: https://github.com/apache/arrow-dotnet/issues/80

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   When given an empty stream, `ArrowStreamReader.GetSchema()` first calls 
`ArrowStreamReaderImplementation.GetSchemaAsync()` and then retrieves the 
`Schema` property. Because the stream is empty, `GetSchemaAsync` [returns 
early](https://github.com/apache/arrow-dotnet/blob/425bc83d1800637b61775d257216b7bfe1a6ca25/src/Apache.Arrow/Ipc/ArrowStreamReaderImplementation.cs#L166)
 without setting `_schema`.
   
   Then, the [`Schema` property 
getter](https://github.com/apache/arrow-dotnet/blob/425bc83d1800637b61775d257216b7bfe1a6ca25/src/Apache.Arrow/Ipc/ArrowReaderImplementation.cs#L39),
 seeing that `_schema` is still null, calls 
`ArrowStreamReaderImplementation.GetSchema()` (the synchronous version). This 
ends up performing a synchronous read on the (still empty) stream.
   
   In particular, this is an issue when the stream is an HTTP request body from 
Kestrel (ASP.NET Core), which disallows synchronous I/O by default. The result 
is an `InvalidOperationException`.
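
The sequence above can be reproduced without Arrow or Kestrel. The following is a minimal sketch, not the library's code: `AsyncOnlyEmptyStream` and `ModelReader` are hypothetical names, a `string` stands in for the `Schema` object, and the stream only imitates a request body that rejects synchronous reads.

```csharp
#nullable enable
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a Kestrel request body: empty, async-only.
sealed class AsyncOnlyEmptyStream : MemoryStream
{
    // Synchronous reads fail, as they would with synchronous I/O disallowed.
    public override int Read(byte[] buffer, int offset, int count) =>
        throw new InvalidOperationException("Synchronous operations are disallowed.");

    // Asynchronous reads report end of stream immediately.
    public override Task<int> ReadAsync(
        byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
        Task.FromResult(0);
}

// Simplified model of the control flow described in this issue.
sealed class ModelReader
{
    private readonly Stream _stream;
    private string? _schema; // stands in for the Schema object

    public ModelReader(Stream stream) => _stream = stream;

    // Lazy getter: falls back to a synchronous read while _schema is null.
    public string? Schema
    {
        get
        {
            if (_schema == null) ReadSchemaSync();
            return _schema;
        }
    }

    public async Task ReadSchemaAsync()
    {
        var buffer = new byte[4];
        int n = await _stream.ReadAsync(buffer, 0, buffer.Length, CancellationToken.None);
        if (n == 0) return; // empty stream: early return, _schema stays null
        _schema = "schema";
    }

    private void ReadSchemaSync()
    {
        var buffer = new byte[4];
        int n = _stream.Read(buffer, 0, buffer.Length); // throws on this stream
        if (n == 0) return;
        _schema = "schema";
    }

    // Mirrors the public method: await the async read, then use the getter.
    public async Task<string?> GetSchema()
    {
        await ReadSchemaAsync();
        return Schema;
    }
}
```

In this model, `await new ModelReader(new AsyncOnlyEmptyStream()).GetSchema()` completes the asynchronous read, reaches the getter with `_schema` still null, and throws `InvalidOperationException` from the synchronous fallback.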
   
   I'm willing to provide a PR if someone else doesn't get to it. My initial 
reaction would be to change `ArrowReaderImplementation.GetSchemaAsync()` to 
return a `ValueTask<Schema>`, but that's not the only possible solution.
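
One way to picture that suggestion is the sketch below. It is a self-contained model under stated assumptions, not a patch: `NoSyncReadStream` and `FixedModelReader` are hypothetical names, a `string` stands in for `Schema`, and returning `null` for an empty stream is only one possible choice that this issue does not settle.

```csharp
#nullable enable
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical empty stream that rejects synchronous reads.
sealed class NoSyncReadStream : MemoryStream
{
    public override int Read(byte[] buffer, int offset, int count) =>
        throw new InvalidOperationException("Synchronous operations are disallowed.");

    public override Task<int> ReadAsync(
        byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
        Task.FromResult(0); // end of stream
}

sealed class FixedModelReader
{
    private readonly Stream _stream;
    private string? _schema; // stands in for the Schema object

    public FixedModelReader(Stream stream) => _stream = stream;

    // The async read returns its result, so callers never need the lazy getter.
    public async ValueTask<string?> ReadSchemaAsync(CancellationToken cancellationToken = default)
    {
        if (_schema != null) return _schema;

        var buffer = new byte[4];
        int n = await _stream.ReadAsync(buffer, 0, buffer.Length, cancellationToken);
        if (n == 0) return null; // empty stream: no schema, no synchronous fallback

        _schema = "schema";
        return _schema;
    }

    // The public method forwards the awaited result instead of reading a property.
    public ValueTask<string?> GetSchema(CancellationToken cancellationToken = default) =>
        ReadSchemaAsync(cancellationToken);
}
```

With this shape, `await new FixedModelReader(new NoSyncReadStream()).GetSchema()` yields `null` without ever calling the synchronous `Read`, so no exception is raised.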

