On 03/07/2014 04:13 AM, David Bierce wrote:
Ello —
I've been following the design and features of Ceph with great interest,
especially compared to the distributed file systems I currently use. One of the
pains with VM workloads is that when writes stall for more than a few seconds,
virtual machines that think they are talking to a real, live block device
generally error out their file systems. In the case of ext? they remount
read-only; with other file systems and operating systems the behavior in that
scenario is erratic at best.
It looks like the default write timeout for an OSD is 30 seconds. With the
write consistency behavior that Ceph has, does that mean a client write could
be stalled for up to 30 seconds in the event of an OSD failing to write, for
whatever reason? If that is the case, is there a way around such a long
timeout in block device terms, short of 1 second checks?
Which timeout are you looking at? By default librados/librbd block
forever, so there shouldn't be a timeout.
I've had multiple VMs hang for hours at a time when I broke a Ceph
cluster, and after I fixed it the VMs started working again.
They only reported some "task blocked for more than 120 seconds"
messages in their dmesg, but that's all.
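
To illustrate the difference between a client that blocks forever and one
that gives up after a timeout, here is a minimal Python sketch. It does not
use librados or librbd at all: the "cluster" is just a timer, and the names
wait_for_write and simulate are made up for this example. It only models
the behavior described above, where a blocked write resumes once the
cluster is fixed.

```python
import threading


def wait_for_write(done, timeout=None):
    """Wait for a (simulated) write acknowledgement.

    timeout=None blocks until the event is set, however long that takes.
    A number of seconds gives up after that long and returns False.
    """
    return done.wait(timeout)


def simulate(cluster_recovers_after, client_timeout=None):
    """Model one write against a cluster that acks after a delay (seconds).

    Hypothetical stand-in for a stalled cluster, not a Ceph API.
    """
    done = threading.Event()
    # The timer plays the role of the cluster coming back and acking the write.
    timer = threading.Timer(cluster_recovers_after, done.set)
    timer.daemon = True
    timer.start()
    try:
        if wait_for_write(done, client_timeout):
            return "completed"
        return "timed out"
    finally:
        timer.cancel()


if __name__ == "__main__":
    # No client timeout: the write waits out the stall and then completes.
    print(simulate(0.2))
    # Client timeout shorter than the stall: the caller sees an error instead.
    print(simulate(0.2, client_timeout=0.05))
```

With no timeout the caller never sees an error, it just waits, which matches
VMs that hang and later carry on. With a timeout the caller gets a failure it
has to handle, which for a guest file system usually means an I/O error.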
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Wido den Hollander
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on