> Le 2 oct. 2015 à 17:59, Jens Axboe <[email protected]> a écrit :
>
> On 10/02/2015 04:04 PM, Fabrice Bacchella wrote:
>> When writing my new hdfs engine, I met a problem with IO merge.
>>
>>
>> Is there a way to prevent that in fio, or is that up to my engine to manage
>> that and merge IO ?--
>
> It sounds like a bug in the fio accounting, for the case of short
> reads/writes. That doesn't happen very often elsewhere, so not unreasonable
> to expect that is the case. Feel free to poke around and figure it out. Let
> me know if that doesn't work out, and I'll take a stab at fixing it up.
>
I'm using this simple fio script:
[global]
size=1m
ioengine=net
hostname=localhost
port=8765
filename=localhost,8765,tcp
[job1]
rw=write
bs=<something>
numjobs=1
And launching a fio listener with:
./fio --server
With ./fio --debug=io,file sample.fio and bs=654820, I got:
io 15106 ->prep(0x779f40)=0
io 15106 queue: io_u 0x779f40: off=0/len=654820/ddir=1/localhost,8765,tcp
io 15106 io complete: io_u 0x779f40:
off=0/len=654820/ddir=1/localhost,8765,tcp
...
io 15705 fill_io_u: io_u 0x1926f40:
off=654820/len=654820/ddir=1/localhost,8765,tcp
io 15705 prep: io_u 0x1926f40:
off=654820/len=654820/ddir=1/localhost,8765,tcp
issued : total=r=0/w=2/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
It fails for the second IO with
job1: (groupid=0, jobs=1): err=104 (file:engines/net.c:684, func=xfer,
error=Connection reset by peer): pid=15106: Tue Oct 6 16:11:59 2015,
but that's not the problem here: it does two 654820-byte IOs, as requested.
With bs=654821 and adding strace -f -e trace=sendto, I got:
io 15157 prep: io_u 0x1660f40: off=0/len=654821/ddir=1/localhost,8765,tcp
io 15157 ->prep(0x1660f40)=0
io 15157 queue: io_u 0x1660f40: off=0/len=654821/ddir=1/localhost,8765,tcp
[pid 15157] sendto(3,
"\220\240@\6\371\341\277>\0\0\0\0\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"...,
654821, MSG_MORE, NULL, 0) = 654820
io 15157 requeue 0x1660f40
io 15157 io_u_queued_completed: min=1
io 15157 getevents: 0
io 15157 prep: io_u 0x1660f40:
off=654820/len=654821/ddir=1/localhost,8765,tcp
io 15157 ->prep(0x1660f40)=0
io 15157 queue: io_u 0x1660f40:
off=654820/len=654821/ddir=1/localhost,8765,tcp
And it still fails: fio issues another IO at the right offset but with the wrong
length. I think it should now try to send the remaining 1 byte; instead it
issues a second full-size IO rather than finishing the first one.
With bs=1m, I got:
io 15205 prep: io_u 0x862f40: off=0/len=1048576/ddir=1/localhost,8765,tcp
io 15205 ->prep(0x862f40)=0
io 15205 queue: io_u 0x862f40: off=0/len=1048576/ddir=1/localhost,8765,tcp
[pid 15205] sendto(3,
"\220\240@\6\371\341\277>\22\24\200\320\36y\313\26\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"...,
1048576, 0, NULL, 0) = 654820
io 15205 requeue 0x862f40
io 15205 io_u_queued_completed: min=1
issued : total=r=0/w=1/d=0, short=r=0/w=1/d=0, drop=r=0/w=0/d=0
So only 654820 bytes are sent; the IO is requeued, but the remainder is never sent.
For sequential IO the result is not totally wrong, and the net engine won't do
random IO anyway:
fio: network IO can't be random
But with my libhdfs engine I'm getting wrong results, because it can do random
IO, and segmented random IO behaves very differently from segmented sequential
IO. Other engines like rbd might have the same problem.
--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html