On Fri, Jun 10, 2011 at 03:21:38AM -0500, Jonathan Nieder <jrnie...@gmail.com> wrote:
> Indeed, read(2) does the same thing (truncates to 7ffff000) and has done
> so for five years, though it's a little harder to notice (I had to use
> mmap to create a large file-backed buffer to read into.)

What the fuck, it's buggy, indeed:

   read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3298534883328) = 2147479552

Well, for read, the situation is a bit different, because that's a clear POSIX violation. While this is obviously not relevant for sendfile, it of course makes sense to use POSIX (or simply traditional Unix) I/O semantics for sendfile as well *iff* read implements POSIX behaviour.

> Background: even with read(2) and write(2), partial progress does not
> necessarily represent an error.

That's true _in general_, but POSIX/Unix/SUS define clear, user-visible semantics for it, which require that the success case transfer as many bytes as can be transferred and not stop some random amount earlier unless there is an error condition (signal => EINTR if not restarted, which applications can easily control - for example, not doing anything with signals makes it work):

   The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading.

In this case, the size of the file was also the sendfile transfer size. If read() were used, and read were POSIX-compliant, then a partial read would necessarily indicate an error of some kind.

> For example, on "slow" devices like a terminal, pipe, or socket, a
> partial success can indicate interruption by a signal, and on a named
> or unnamed pipe it can indicate that fewer than the requested number of
> bytes were immediately available.

A file is not a terminal, pipe or socket - I specifically reported a file->file problem, not a socket->file or file->socket problem (the former is probably not supported and the latter has very different error modes/behaviour). For files, unless you do signal stuff, a partial (POSIX) read indicates error or end of file. Applications are well aware of the differences between sockets and files (set nonblocking mode, for example, to see very different behaviour).

The biggest difference between your example and my example, however, is that the POSIX requirements are so strong, and the Unix behaviour has been working for so long, that applications that get short writes or reads when they know there is more data can rightfully abort the process. With sockets, POSIX semantics are different, so the normal behaviour of applications is to retry. This works fine as long as the OS follows POSIX.

> So I am somewhat curious about these many programs --- why are they
> expecting this from sendfile?

Because that's how every Unix works now and has in the past, and that's what the Unix standard requires. I think it is reasonable for programs to expect sendfile to behave like a synthesized read+write, as opposed to "weird" semantics, and I think it is reasonable for read() to follow the POSIX semantics nowadays. It shouldn't be that hard to implement, and the portability gain from having POSIX behaviour is immense. That Linux apparently fails to implement this for read too makes it consistent (which is kind of good), but creates a portability problem for Unix programs.

> The manpage is outdated and does not even indicate that sendfile can be
> used to copy a file.

My manpage clearly allows it:

   sendfile() copies data between one file descriptor and another.

There are no other hard requirements listed for sendfile. It mentions that in 2.6.9 there are extra requirements, but that's obviously not relevant to (and untrue for) 2.6.39.
That's no different from read or write, both of which also work on file descriptors and put no other requirements on them. Since files are accessible via file descriptors, the sendfile manpage clearly says it can be used to copy a file (or more correctly, to transfer data from one file to another).

However, the manpage says:

   Applications may wish to fall back to read(2)/write(2) in the case where sendfile() fails with EINVAL or ENOSYS.

And this is in fact what many applications do: try it, and then fall back. It's also common sense, and the rationale behind the design (cf. Linus's mails on that topic) - sendfile should implement what the kernel can do more efficiently, and otherwise signal the application that it should do it itself (EINVAL).

The expected application behaviour is just that: fall back to read/write, and this worked in the past. The problem is precisely that sendfile changed semantics.

> Has the size allowed for a single sendfile(2) call changed over time?

Implementations following the manpage worked in the past, yes, because read or write emulation usually uses buffers smaller than 2GB (and if read(2) were fixed, it would even work with larger buffer sizes). The size allowed for a single sendfile was about 0 in earlier versions, because they returned EINVAL.

> Is this a regression or a request for a new feature?

It seems there are two regressions: read no longer being POSIX compliant, and sendfile no longer telling applications to use a (working) read/write loop but instead attempting the copy itself.

> If an application wants to print a useful error message, it has to try
> again until sendfile returns -1 so errno can be set.

That's clearly just an opinion. The authors of GNU tar and many existing applications apparently disagree, as do I. It's widespread behaviour to expect POSIX semantics nowadays, and quite reasonable to expect similar behaviour from sendfile.
In fact, so much peril has been brought upon the world by introducing horribly misdesigned interfaces such as epoll() (and, to a lesser extent, similar mechanisms in other kernels) that creating consistency in the form of using POSIX semantics for any file I/O is clearly a good thing. But that's just my opinion :)

> Anyway, I agree that it would be better for sendfile to return partial
> results less often,

I think sendfile should follow the same semantics as Unix read(), and further, Linux should follow both de facto historical Unix behaviour and POSIX/SUS behaviour and not return partial results in cases not allowed by POSIX.

> to make one-off programs easier to write and to decrease the number of
> syscalls made, but that doesn't seem worth

The whole *point* of sendfile is to decrease the number of syscalls, for high-performance programs. If overhead isn't an issue, then read+write are much more portable, and typically easier to use. As such, if sendfile requires extra unnecessary syscalls, this is clearly a design violation.

> write more than fits in an "int" at a given moment. So I'm marking this
> wontfix for now.

Should I open a separate bug for read(2) then, or will POSIX compliance also be a wontfix (a valid position)?

> An obvious possible improvement would be to update the manpages to
> include information about this. Would you be interested in that, and if
> so, can you suggest a wording?

I guess something like this would be fine:

   sendfile is not the same as a read+write combination - it may transfer and return fewer bytes than requested for no user-visible reason.

That would require read(2) to be fixed. A warning that read doesn't implement POSIX semantics for file I/O and might also erroneously return partial results might be very useful, too.
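
For illustration only, here is a rough sketch of what an application has to do if it cannot rely on a single call transferring the whole file: loop on short counts, and fall back to read/write when sendfile fails with EINVAL or ENOSYS, as the manpage suggests. It is written in Python because os.read(), os.write() and os.sendfile() wrap the corresponding syscalls fairly directly. The names copy_fd, copy_with_read_write and CHUNK are hypothetical, not taken from any existing program, and the sketch assumes blocking descriptors and Linux behaviour for a None offset.

```python
import errno
import os

CHUNK = 1 << 20  # hypothetical per-call request size; any positive value works


def copy_with_read_write(in_fd, out_fd, count):
    """Copy up to `count` bytes with read()/write(), retrying short transfers."""
    copied = 0
    while copied < count:
        data = os.read(in_fd, min(CHUNK, count - copied))
        if not data:  # zero bytes returned: end of file
            break
        view = memoryview(data)
        while view:  # write() may accept fewer bytes than offered
            written = os.write(out_fd, view)
            view = view[written:]
        copied += len(data)
    return copied


def copy_fd(in_fd, out_fd, count):
    """Copy up to `count` bytes, preferring sendfile() and tolerating short counts."""
    if not hasattr(os, "sendfile"):  # platform without sendfile at all
        return copy_with_read_write(in_fd, out_fd, count)
    copied = 0
    while copied < count:
        try:
            # offset=None: read from, and advance, in_fd's own file position
            sent = os.sendfile(out_fd, in_fd, None, count - copied)
        except OSError as exc:
            if exc.errno in (errno.EINVAL, errno.ENOSYS):
                # the fallback the manpage recommends
                return copied + copy_with_read_write(in_fd, out_fd, count - copied)
            raise
        if sent == 0:  # end of file before `count` bytes were available
            break
        copied += sent  # a short count is retried, not treated as an error
    return copied
```

The extra loop around sendfile is exactly the additional syscall traffic complained about above; with the semantics POSIX gives read() on regular files, a short count could instead be reported as an error or end of file straight away.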

-- 
Marc Lehmann
schm...@schmorp.de
Deliantra, the free code+content MORPG: http://www.deliantra.net